Can you Guess a Cuisine from its Ingredients?

Cooking is sometimes used as a metaphor for data preparation in machine learning. In order to practice my skills in machine learning, I decided to look for a dataset related to cooking. After a quick research online, I found this Kaggle dataset. In this Kaggle competition, the game is to predict the category of a dish’s cuisine given a list of its ingredients.

Exploratory Data Analysis

Let’s begin by reading and joining the datasets downloaded on Kaggle.

Show code
library(tidyverse)
library(jsonlite)

train  <- fromJSON("input/train.json", flatten = TRUE) %>%
  mutate(cuisine = as.factor(cuisine))

test <- fromJSON("input/test.json", flatten = TRUE) %>%
  mutate(cuisine = NA, # Add cuisine variable
         cuisine = factor(cuisine, levels = c(levels(train$cuisine)))) #add levels

train_unnested <- train %>%
  tidyr::unnest(ingredients) %>%
  mutate(ingredients = str_replace_all(ingredients, "-","_"), # allow dash
         ingredients = str_remove_all(ingredients, "[^A-Za-z0-9_ ]"), #keep letters, spaces, dash
         type = "train")

test_unnested <- test %>%
  tidyr::unnest(ingredients) %>%
  mutate(ingredients = str_replace_all(ingredients, "-","_"), # allow dash
         ingredients = str_remove_all(ingredients, "[^A-Za-z0-9_ ]"), #keep letters, spaces, dash
         type = "test")

dataset <- full_join(train_unnested, test_unnested) %>%
  as_tibble()

dataset %>% arrange(id)
## # A tibble: 535,670 x 4
##       id cuisine ingredients                        type 
##    <int> <fct>   <chr>                              <chr>
##  1     0 spanish mussels                            train
##  2     0 spanish ground black pepper                train
##  3     0 spanish garlic cloves                      train
##  4     0 spanish saffron threads                    train
##  5     0 spanish olive oil                          train
##  6     0 spanish stewed tomatoes                    train
##  7     0 spanish arborio rice                       train
##  8     0 spanish minced onion                       train
##  9     0 spanish medium shrimp                      train
## 10     0 spanish fat free less sodium chicken broth train
## # ... with 535,660 more rows

Now we will realize some basic visualizations in order to better understand our dataset.

What is the repartition of our train dataset in terms of cuisine?

Show code
dataset %>%
  filter(type == "train") %>%
  group_by(cuisine) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = cuisine, y = n)) + 
  geom_col() +
  coord_flip() +
  ggthemes::theme_economist_white(horizontal = FALSE) +
  theme(plot.background = element_rect(fill = "#f8f2e4")) +
  labs(x = "", y = "Number of Recipes",
       title = "Twenty cuisines",
       subtitle = "What's cooking?",
       caption = "Félix Luginbühl (@lgnbhl)\nData source: Kaggle, Yummly")

Italian and Mexican recipes are prevalent. We also notice a class imbalance, which should be taken into account when training the model.

Let’s explore the repartition of the number of ingredients per recipe for each cuisine in the train dataset.

Show code
dataset %>%
  filter(type == "train") %>%
  group_by(id, cuisine) %>%
  summarize(n = n()) %>%
  group_by(cuisine) %>%
  ggplot(aes(x = cuisine, y = n)) +
  geom_boxplot() + 
  coord_flip() +
  ggthemes::theme_economist_white(horizontal = FALSE) +
  theme(plot.background = element_rect(fill = "#f8f2e4")) +
  labs(x = "", y = "Number of ingredients per recipe",
       title = "Twenty cuisines",
       subtitle = "What's cooking?",
       caption = "Félix Luginbühl (@lgnbhl)\nData source: Kaggle, Yummly")

We see important outliers in each cuisine.

Another way of comparing the variation of the number of ingredients per recipe by cuisine is to calculate the coefficient of variation (CV)1 of each cuisine.

Show code
library(raster)

dataset %>%
  filter(type == "train") %>%
  group_by(id, cuisine) %>%
  summarise(n = n()) %>%
  group_by(cuisine) %>%
  summarise(mean = mean(n),
            cv = cv(n)) %>%
  arrange(desc(cv))
## # A tibble: 20 x 3
##    cuisine       mean    cv
##    <fct>        <dbl> <dbl>
##  1 brazilian     9.52  58.4
##  2 japanese      9.74  43.6
##  3 british       9.71  42.9
##  4 mexican      10.9   42.8
##  5 french        9.82  42.2
##  6 vietnamese   12.7   41.5
##  7 southern_us   9.63  40.2
##  8 spanish      10.4   39.9
##  9 irish         9.30  39.8
## 10 russian      10.2   39.6
## 11 indian       12.7   39.5
## 12 jamaican     12.2   39.0
## 13 filipino     10     38.6
## 14 italian       9.91  38.4
## 15 moroccan     12.9   37.2
## 16 greek        10.2   36.6
## 17 cajun_creole 12.6   36.6
## 18 thai         12.5   35.2
## 19 korean       11.3   34.4
## 20 chinese      12.0   33.7

We see that the Brazilian cuisine have much more variation (CV of 58.4%) than the Chinese cuisine (CV of 33.7%) relative to their means.

How many unique ingredients do we have in the dataset?

Show code
dataset %>%
  distinct(ingredients) %>%
  count()
## # A tibble: 1 x 1
##       n
##   <int>
## 1  7135

This is a lot of ingredients. We will have to do features reduction.

What are the top 100 ingredients of the full dataset?

Show code
library(wordcloud)
set.seed(1234)

dataset %>%
  count(ingredients, sort = TRUE) %>%
  with(wordcloud(ingredients, n, max.words = 100, scale = c(6, .1), colors = brewer.pal(6, 'Dark2')))

What about the top 10 ingredients by cuisine?

Show code
library(DT)

top10 <- train_unnested %>% 
  group_by(cuisine) %>%
  count(ingredients, sort = TRUE) %>% 
  top_n(10) %>%
  arrange(desc(cuisine))

DT::datatable(
  top10,
  options = list(pageLength = 5, dom = "ftpi"),
  rownames = FALSE,
  caption = "Top 10 Ingredients by Cuisine")

This table will lead us, in the feature engineering step, to count the number of ingredients “typical” from different cultural regions.

Preparing the Data

First, we want to tidy our dataset. We will use the unnest_tokens function from {tidytext} before extracting the word stems with {SnowballC}.

Show code
library(tidytext)
library(SnowballC)

dataset_tidy <- dataset %>%
  tidytext::unnest_tokens(word, ingredients, drop = FALSE) %>%
  anti_join(stop_words) %>%
  mutate(wordStem = SnowballC::wordStem(word)) #Stemming

dataset_tidy %>% arrange(id)
## # A tibble: 994,837 x 6
##       id cuisine ingredients         type  word    wordStem
##    <int> <fct>   <chr>               <chr> <chr>   <chr>   
##  1     0 spanish mussels             train mussels mussel  
##  2     0 spanish ground black pepper train ground  ground  
##  3     0 spanish ground black pepper train black   black   
##  4     0 spanish ground black pepper train pepper  pepper  
##  5     0 spanish garlic cloves       train garlic  garlic  
##  6     0 spanish garlic cloves       train cloves  clove   
##  7     0 spanish saffron threads     train saffron saffron 
##  8     0 spanish saffron threads     train threads thread  
##  9     0 spanish olive oil           train olive   oliv    
## 10     0 spanish olive oil           train oil     oil     
## # ... with 994,827 more rows

As you can see, we have now the ingredient words in the word and the stem word in wordStem. Note that “soy” became “soi” with the stemming.

We will now build a document-term matrix with the cast_dtm function of {tidytext}. We will performe feature reduction by removing the less frequent words.

Show code
library(tm)

dataset_dtm <- dataset_tidy %>%
  count(id, wordStem, sort = TRUE) %>%
  cast_dtm(id, wordStem, n, weighting = tm::weightTf) #tm:weightTfIdf also possible

dataset_dtm <- dataset_dtm %>%
  tm::removeSparseTerms(sparse = 0.999) %>%
  as.matrix() %>%
  as.data.frame() %>%
  rownames_to_column("id") %>%
  mutate(id = as.numeric(id)) %>%
  inner_join(dataset_tidy %>%
               select(id, type, cuisine), by = "id") %>%
  mutate(cuisine = as.factor(cuisine)) %>%
  distinct() %>%
  select(cuisine, id, type, everything()) %>%
  as_tibble()

dataset_dtm %>% arrange(id)
## # A tibble: 49,718 x 803
##    cuisine        id type  ground pepper fresh  seed chicken   dri  sauc
##    <fct>       <dbl> <chr>  <dbl>  <dbl> <dbl> <dbl>   <dbl> <dbl> <dbl>
##  1 spanish        0. train     1.     1.    0.    0.      1.    0.    0.
##  2 mexican        1. train     3.     4.    1.    0.      0.    0.    0.
##  3 french         2. train     0.     1.    0.    0.      1.    0.    0.
##  4 chinese        3. train     0.     1.    1.    0.      0.    0.    1.
##  5 italian        4. train     0.     0.    0.    0.      0.    0.    0.
##  6 <NA>           5. test      1.     0.    0.    0.      0.    1.    1.
##  7 chinese        6. train     0.     0.    0.    0.      0.    0.    1.
##  8 <NA>           7. test      4.     1.    1.    0.      1.    0.    0.
##  9 french         8. train     0.     0.    0.    0.      0.    0.    0.
## 10 southern_us    9. train     0.     1.    0.    0.      0.    2.    0.
## # ... with 49,708 more rows, and 793 more variables

Nice! We have a dataset of 803 variables.

Feature Engineering

Now some quick feature engineering. We will add a nIng variable, which give the number of ingredients by recipe. We will also create three variables that count the number of ingredients “typical” from three large regions: Asian, South and North ingredients. The idea comes from this other Kernel.

Show code
sumIng <- dataset_tidy %>%
  group_by(id) %>%
  count(id) %>%
  rename(nIng = n)

dataset_dtm <- dataset_dtm %>%
  full_join(sumIng, by = "id")

dataset_dtm %>%
  select(cuisine, id, nIng, type, everything()) %>%
  arrange(id)

#ref: https://www.kaggle.com/alonalevy/cultural-diffusion-by-recipes
north_cuisine <- c("british", "irish", "southern_us", "russian", "french")
south_cuisine <- c("brazilian", "cajun_creole", "greek", "indian", "italian", "jamaican", "mexican", "moroccan", "spanish")
asian_cuisine <- c("filipino")

ingredients_north <- dataset_tidy %>%
  select(id, cuisine, wordStem) %>%
  filter(cuisine %in% north_cuisine)

ingredients_south <- dataset_tidy %>%
  select(id, cuisine, wordStem) %>%
  filter(cuisine %in% south_cuisine)

ingredients_asian <- dataset_tidy %>%
  select(id, cuisine, wordStem) %>%
  filter(cuisine %in% asian_cuisine)

common_north <- ingredients_north %>%
  anti_join(ingredients_south, by = "wordStem") %>%
  anti_join(ingredients_asian, by = "wordStem")

common_south <- ingredients_south %>%
  anti_join(ingredients_north, by = "wordStem") %>%
  anti_join(ingredients_asian, by = "wordStem")

common_asian <- ingredients_asian %>%
  anti_join(ingredients_north, by = "wordStem") %>%
  anti_join(ingredients_south, by = "wordStem")

# Now let’s add three variables that count the number of occurrences of
# each of this regional ingredients by recipe.

dataset_tidy2 <- dataset_tidy %>%
  mutate(north_ing = ifelse(dataset_tidy$wordStem %in% common_north$wordStem, 1, 0)) %>%
  mutate(south_ing = ifelse(dataset_tidy$wordStem %in% common_south$wordStem, 1, 0)) %>%
  mutate(asian_ing = ifelse(dataset_tidy$wordStem %in% common_asian$wordStem, 1, 0)) %>%
  select(cuisine, north_ing, south_ing, asian_ing, everything())

# count the regional wordSteam by id
nAsian_ingredients <- dataset_tidy2 %>%
  select(id, asian_ing) %>%
  filter(asian_ing == 1) %>%
  count(id) %>%
  rename(nAsian_ing = n)

nSouth_ingredients <- dataset_tidy2 %>%
  select(id, south_ing) %>%
  filter(south_ing == 1) %>%
  count(id) %>%
  rename(nSouth_ing = n)

nNorth_ingredients <- dataset_tidy2 %>%
  select(id, north_ing) %>%
  filter(north_ing == 1) %>%
  count(id) %>%
  rename(nNorth_ing = n)

# We can now join these three variables to our main dataset.
dataset_final <- dataset_dtm %>%
  full_join(nAsian_ingredients, by = "id") %>%
  full_join(nSouth_ingredients, by = "id") %>%
  full_join(nNorth_ingredients, by = "id") %>%
  # replace NA by 0 in the new variables
  mutate_at(vars(c(nAsian_ing, nNorth_ing, nSouth_ing)), funs(replace(., is.na(.), 0)))

write_csv(dataset_final, "output/dataset_final.csv") #save dataset

dataset_final %>%
  select(cuisine, id, nIng, nAsian_ing, nNorth_ing, nSouth_ing, everything()) %>%
  arrange(id)
## # A tibble: 49,718 x 807
##    cuisine        id  nIng nAsian_ing nNorth_ing nSouth_ing type  ground
##    <fct>       <dbl> <int>      <dbl>      <dbl>      <dbl> <chr>  <dbl>
##  1 spanish        0.    25         0.         0.         0. train     1.
##  2 mexican        1.    36         0.         0.         0. train     3.
##  3 french         2.    26         0.         1.         0. train     0.
##  4 chinese        3.    19         0.         0.         0. train     0.
##  5 italian        4.     8         0.         0.         0. train     0.
##  6 <NA>           5.    17         0.         0.         0. test      1.
##  7 chinese        6.     9         0.         0.         0. train     0.
##  8 <NA>           7.    31         0.         0.         1. test      4.
##  9 french         8.    16         0.         0.         0. train     0.
## 10 southern_us    9.    19         0.         0.         0. train     0.
## # ... with 49,708 more rows, and 799 more variables

Machine learning with rpart and xgboost

Let’s begin by building a simple tree. As the game is to work in the tidy way, I tried to find an equivalent of {rpart} in the tidyverse ecosystem. Sadly, {broom} doesn’t work with classification trees and the {broomstick} package, which aims to implement them, is still in development. So we will do this in the traditional way.

Show code
# splitting the `train` dataset using {rsample}
library(rsample)
set.seed(1234)

train_test_split <- dataset_final %>%
  filter(type == "train") %>%
  initial_split(prop = 0.8)

# Retrieve train and test sets
train_tbl <- training(train_test_split) %>% select(-type)
test_tbl  <- testing(train_test_split) %>% select(-type)

library(rpart)
library(rpart.plot)
set.seed(1234)

cartModel <- rpart(cuisine ~ ., data = train_tbl, method = "class")

prp(cartModel)

The plot reveals the most importants ingredients of the model.

The accuracy of our classification model is “only” of 42%.

Show code
library(caret)

cartPredict <- predict(cartModel, newdata = test_tbl, type = "class")

cartCM <- caret::confusionMatrix(cartPredict, test_tbl$cuisine)

print(cartCM$overall)
##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
##      0.4181544      0.3280825      0.4072832      0.4290856      0.2000251 
## AccuracyPValue  McnemarPValue 
##      0.0000000            NaN

We will try again, but this time with one of the algorithm which dominates Kaggle competitions: XGboost. XGBoost implements gradient boosted decision trees. It is very fast and handle well overfitting.

Show code
library(xgboost)
set.seed(1234)

train_matrix <- xgb.DMatrix(as.matrix(select(train_tbl, -cuisine)), 
                            label = as.numeric(train_tbl$cuisine)-1) #feature index starts from 0

# train our multiclass classification model using softmax
# default parameters, with a maximum depth of 25.
xgbModel <- xgboost(train_matrix, 
                    max.depth = 25, 
                    nround = 10, 
                    objective = "multi:softmax", 
                    num_class = 20)

    ## [1]  train-merror:0.221873 
    ## [2]  train-merror:0.178064 
    ## [3]  train-merror:0.150189 
    ## [4]  train-merror:0.127278 
    ## [5]  train-merror:0.108580 
    ## [6]  train-merror:0.093212 
    ## [7]  train-merror:0.078661 
    ## [8]  train-merror:0.067410 
    ## [9]  train-merror:0.057668 
    ## [10] train-merror:0.049151

xgbPredict <- predict(xgbModel, newdata = as.matrix(select(test_tbl, -cuisine)))

# change cuisine back to string
xgbPredictText <- levels(test_tbl$cuisine)[xgbPredict + 1]

# for a tidy confusion matrix, the `conf_mat` function of {yardstick} can also be used.
confMat <- confusionMatrix(as.factor(xgbPredictText), test_tbl$cuisine)

confMat$overall
##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
##      0.7607493      0.7318283      0.7512162      0.7700910      0.2000251 
## AccuracyPValue  McnemarPValue 
##      0.0000000            NaN

We can visualize the confusion matrix of the model using {ggplot2}.

Show code
#ref: https://stackoverflow.com/questions/37897252
confMat$table %>%
  as.data.frame() %>%
  ggplot(aes(x = Prediction, y = Reference)) +
  geom_tile(aes(fill = Freq)) +
  geom_text(aes(label = sprintf("%1.0f", Freq)), vjust = 1) +
  scale_fill_gradient(low = "blue", high = "red", trans = "log") +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 16),
        plot.caption = element_text(colour = "dimgrey"),
        axis.text.x = element_text(angle = 45, vjust = 1, hjust=1),
        plot.background = element_rect(fill = "#f8f2e4")) +
  labs(title = "Confusion matrix",
       subtitle = "What's cooking?",
       fill = "Freq (log)",
       caption = "Félix Luginbühl (@lgnbhl)\nData source: Kaggle, Yummly")

The confusion matrix reveals the class imbalance. What about the sensitivity (true positive rate) results by class?

Show code
confMat$byClass %>%
  as.data.frame() %>%
  rownames_to_column("cuisine") %>%
  select(cuisine, Sensitivity, Specificity) %>%
  arrange(Sensitivity)
##                cuisine Sensitivity Specificity
## 1       Class: british   0.3221477   0.9932095
## 2       Class: russian   0.3673469   0.9978360 
## 3       Class: spanish   0.3846154   0.9938136   
## 4         Class: irish   0.4472050   0.9947389   
## 5      Class: filipino   0.4936709   0.9955105   
## 6        Class: french   0.5615079   0.9707383    
## 7     Class: brazilian   0.5773196   0.9973272  
## 8    Class: vietnamese   0.5785714   0.9930893   
## 9      Class: japanese   0.6134185   0.9938490    
## 10        Class: greek   0.6923077   0.9940515    
## 11     Class: jamaican   0.7019231   0.9978344    
## 12 Class: cajun_creole   0.7046980   0.9912487    
## 13     Class: moroccan   0.7218543   0.9956427  
## 14       Class: korean   0.7245509   0.9964043   
## 15         Class: thai   0.7370130   0.9905833    
## 16  Class: southern_us   0.7784091   0.9549053    
## 17      Class: chinese   0.8241563   0.9806521    
## 18      Class: italian   0.8818353   0.9410655   
## 19       Class: indian   0.9037801   0.9875203   
## 20      Class: mexican   0.9097331   0.9736527    

The sensibility metric shows that the British cuisine is correctly identified as such only one third of the time (32%), followed by Russian and Spanish cuisine.

Finally let’s look at the 20 most important features of the model.

Show code
names <- colnames(select(train_tbl, -cuisine))
importance_matrix <- xgb.importance(names, model = xgbModel)

xgb.ggplot.importance(importance_matrix[1:30,]) +
  theme_bw() +
  theme(legend.background = element_blank(),
        legend.key = element_blank(),
        plot.title = element_text(size = 14, face = "bold"),
        plot.background = element_rect(fill = "#f8f2e4")) +
  labs(title = "Feature importance",
       subtitle = "What's cooking?",
       caption = "Félix Luginbühl (@lgnbhl)\nData source: Kaggle, Yummly")

The most important feature to predict the cuisine is the word tortilla, followed by soi (i.e. soja), parmesan and the number of ingredients in the recipe.

It would be frustrating to stop here. Let’s submit our work on Kaggle to know how well our model “really” performed.

Submitting on Kaggle

To submit our predictions on Kaggle, let’s use the same code as before but with the full training dataset. We will run xggoost, this time with 120 rounds. It will take some time! Let’s calculate how much.

Show code
training <- dataset_final %>%
  filter(type == "train") %>%
  select(-type)

testing <- dataset_final %>%
  filter(type == "test") %>%
  select(-type)

testing <- testing %>%
  mutate(cuisine = "NA") %>%
  mutate(cuisine = factor(cuisine, levels = c(levels(training$cuisine)))) #add levels

library(xgboost)
set.seed(1234)

train_matrix <- xgb.DMatrix(as.matrix(select(training, -cuisine)), 
                            label = as.numeric(training$cuisine)-1)

start_time <- Sys.time()

xgbModel2 <- xgboost(train_matrix, 
                    max.depth = 25, 
                    nround = 120, 
                    objective = "multi:softmax", 
                    num_class = 20)

end_time <- Sys.time()
time_taken <- end_time - start_time

time_sec <- difftime(end_time, start_time, units = "secs")

print(time_taken)
## Time difference of 4.359697 hours

To build the xgboost model, my computer worked more than 4 hours!

We can create and submit the CSV file on Kaggle.

Show code
# predict and change cuisine back to string
xgb.submit <- predict(xgbModel2, newdata = as.matrix(select(test_tbl, -cuisine)))
xgb.submit.text <- levels(testing$cuisine)[xgb.submit + 1]

# submission
sample <- read.csv('input/sample_submission.csv')
predictions <- data_frame(id = as.integer(testing$id), cuisine = as.factor(xgb.submit.text))

sample %>%
  select(id) %>%
  inner_join(predictions) %>%
  write_csv("xgboost_submission.csv")

We have a accuracy score of 0.7871. Nice!

Using h2o’s AutoML Algorithm

Could we have a better score on Kaggle using the new h2o’s Automatic machine learning function?

Let’s split the training dataset again and run the h2o.automl during the same time of xgboost.

Show code
library(h2o)
h2o.init() # max_mem_size = "8g"

training_h2o <- as.h2o(training)

split_h2o <- h2o.splitFrame(training_h2o, 0.8, seed = 1234)

train_h2o <- h2o.assign(split_h2o[[1]], "train" ) # 80%
valid_h2o <- h2o.assign(split_h2o[[2]], "valid" ) # 20%

target <- "cuisine"
predictors <- setdiff(names(train_h2o), target)

start_time_h2o <- Sys.time()

aml <- h2o.automl(x = predictors, 
                  y = target, 
                  training_frame = train_h2o,
                  leaderboard_frame = valid_h2o,
                  balance_classes = TRUE,
                  max_runtime_secs = as.numeric(time_sec),
                  seed = 1234
                  )

end_time_h2o <- Sys.time()

time_taken_h2o <- end_time_h2o - start_time_h2o

print(time_taken_h2o)
## Time difference of 8.775196 hours

Oops!

Our function ran twice the time of xgboost (almost 9 hours). It was expected that max_runtime_secs would compute all the running time of the function. But it doesn’t. The time of the computation actually doubled. Good to know for next time!

Finally, let’s extract the leader model, predict the class on the testing dataset and create the CSV file for the Kaggle submission. What is the result on Kaggle of our Stacked Ensemble model?

Show code
# Extract leader model
automl_leader <- aml@leader

# Predict on test set
testing_h2o <- as.h2o(testing)

pred_conversion <- h2o.predict(object = automl_leader, newdata = testing_h2o)

pred_conversion <- as.data.frame(pred_conversion)

predictions <- testing %>%
  mutate(pred = as.character(as.list(pred_conversion[1]))) %>%
  mutate(pred = as.factor(pred)) %>%
  select(id, pred)

sample %>%
  select(id) %>%
  inner_join(predictions) %>%
  rename(cuisine = pred) %>%
  mutate(cuisine = as.factor(cuisine)) %>%
  mutate(id = as.integer(id)) %>%
  write_csv("h2o_submission.csv")

The accuracy score of our model is 0.79314. It is better than our previous xgboost model (0.7871). However, it is hard to judge in terms of algorithm performance, as h2o’s AutoML ran twice the time of xgboost on my computer.

Thanks for reading!

  • For updates of recent blog posts, follow me on Twitter.
  • For reproducing my data analysis, go on my Github page.
  • Curious about what I can do for your organisation? Have a look my services.
  1. the sample standard deviation should not be used as the mean of the ingredients by recipes varies according to the cuisine. 

Leave a Comment