Jupyter Notebook Publisher
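Let's filter the dataset further so that 45% of the foods are one-worded, 30% are two-worded, and 25% are three-worded. All one-worded foods are kept; the two-worded and three-worded foods are shuffled and then cut down to their share of the total.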
import pandas as pd

# shuffle the 2-worded and 3-worded foods, since we'll keep only a leading slice of each
two_worded_foods = two_worded_foods.sample(frac=1)
three_worded_foods = three_worded_foods.sample(frac=1)

# combine the foods; pd.concat replaces Series.append, which was removed in pandas 2.0
foods = pd.concat([
    one_worded_foods,
    two_worded_foods[:round(total_num_foods * 0.30)],
    three_worded_foods[:round(total_num_foods * 0.25)],
])

# print the resulting sizes
for i in range(3):
    print(f"{i+1}-worded food entities:", foods[foods.str.split().apply(len) == i + 1].size)
1-worded food entities: 1258
2-worded food entities: 839
3-worded food entities: 699
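The cell above depends on variables defined earlier in the notebook (`one_worded_foods`, `total_num_foods`, and so on). As a rough, self-contained sketch of the same idea, the snippet below wraps the steps in a hypothetical helper, `mix_by_word_count`, and runs it on made-up toy data. It assumes the total is derived from the one-worded count divided by 0.45, which is consistent with the sizes printed above (1258 / 0.45 ≈ 2796) but is not confirmed by this section. It also uses `.head()` and a fixed `random_state`, which are choices made here for clarity and reproducibility.

```python
import pandas as pd


def mix_by_word_count(foods, shares=(0.45, 0.30, 0.25), seed=0):
    """Keep every one-worded food, then trim the 2- and 3-worded foods to the target shares.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    word_counts = foods.str.split().str.len()
    one_worded = foods[word_counts == 1]

    # Assumption: the overall total is implied by the one-worded group, which is kept whole.
    total = round(len(one_worded) / shares[0])

    parts = [one_worded]
    for n_words, share in zip((2, 3), shares[1:]):
        group = foods[word_counts == n_words]
        # Shuffle, then keep the first rows; head() returns fewer rows if the group is too small.
        parts.append(group.sample(frac=1, random_state=seed).head(round(total * share)))
    return pd.concat(parts)


# Made-up toy data: 9 one-worded, 10 two-worded, and 10 three-worded names.
toy = pd.Series(
    [f"food{i}" for i in range(9)]
    + [f"fresh food{i}" for i in range(10)]
    + [f"very fresh food{i}" for i in range(10)]
)

mixed = mix_by_word_count(toy)
sizes = mixed.str.split().str.len().value_counts().sort_index()
print(sizes.to_dict())
```

On the toy data the helper keeps all 9 one-worded names and trims the other two groups to 6 and 5, giving the 45/30/25 split. If a group has fewer rows than its target, the shares will drift, so it is worth checking the printed sizes as the notebook does.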
Copyright © 2021 HostJupyter