DS30


Q. 1) Create an XML file which gives details of the books available in “Bookstore” from the following
categories.

  1. Yoga
  2. Story
  3. Technical
and the elements in each category are in the following format:

<Book>
<Book_Title> --------------</Book_Title>
<Book_Author> ---------------</Book_Author>
<Book_Price> --------------</Book_Price>
</Book>
Save the file as “Bookcategory.xml” [Marks 15]
<Bookstore>
  <Yoga>
    <Book>
      <Book_Title>Book1_Title</Book_Title>
      <Book_Author>Book1_Author</Book_Author>
      <Book_Price>Book1_Price</Book_Price>
    </Book>
    <Book>
      <Book_Title>Book2_Title</Book_Title>
      <Book_Author>Book2_Author</Book_Author>
      <Book_Price>Book2_Price</Book_Price>
    </Book>
    <!-- Add more books in Yoga category if needed -->
  </Yoga>
  <Story>
    <Book>
      <Book_Title>Book3_Title</Book_Title>
      <Book_Author>Book3_Author</Book_Author>
      <Book_Price>Book3_Price</Book_Price>
    </Book>
    <Book>
      <Book_Title>Book4_Title</Book_Title>
      <Book_Author>Book4_Author</Book_Author>
      <Book_Price>Book4_Price</Book_Price>
    </Book>
    <!-- Add more books in Story category if needed -->
  </Story>
  <Technical>
    <Book>
      <Book_Title>Book5_Title</Book_Title>
      <Book_Author>Book5_Author</Book_Author>
      <Book_Price>Book5_Price</Book_Price>
    </Book>
    <Book>
      <Book_Title>Book6_Title</Book_Title>
      <Book_Author>Book6_Author</Book_Author>
      <Book_Price>Book6_Price</Book_Price>
    </Book>
    <!-- Add more books in Technical category if needed -->
  </Technical>
</Bookstore>
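
One way to check that the file is well-formed and follows the required format is to parse it with Python's standard xml.etree.ElementTree module. The following is a minimal sketch, not part of the required answer: it embeds a shortened copy of the structure (one book per category) as a string so it runs without the file on disk, and the names XML_TEXT and list_books are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Bookcategory.xml structure (one book per category),
# embedded as a string so the sketch runs without the file on disk.
XML_TEXT = """<Bookstore>
  <Yoga>
    <Book>
      <Book_Title>Book1_Title</Book_Title>
      <Book_Author>Book1_Author</Book_Author>
      <Book_Price>Book1_Price</Book_Price>
    </Book>
  </Yoga>
  <Story>
    <Book>
      <Book_Title>Book3_Title</Book_Title>
      <Book_Author>Book3_Author</Book_Author>
      <Book_Price>Book3_Price</Book_Price>
    </Book>
  </Story>
  <Technical>
    <Book>
      <Book_Title>Book5_Title</Book_Title>
      <Book_Author>Book5_Author</Book_Author>
      <Book_Price>Book5_Price</Book_Price>
    </Book>
  </Technical>
</Bookstore>"""


def list_books(xml_text):
    """Return (category, title, author, price) tuples for every <Book>."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    rows = []
    for category in root:  # <Yoga>, <Story>, <Technical>
        for book in category.findall("Book"):
            rows.append((category.tag,
                         book.findtext("Book_Title"),
                         book.findtext("Book_Author"),
                         book.findtext("Book_Price")))
    return rows


for row in list_books(XML_TEXT):
    print(row)
```

To run the same check against the saved file, ET.parse("Bookcategory.xml").getroot() can replace the ET.fromstring call.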

Q. 2) Create the dataset transactions = [['eggs', 'milk', 'bread'], ['eggs', 'apple'], ['milk', 'bread'], ['apple',
'milk'], ['milk', 'apple', 'bread']].
Convert the categorical values into numeric format. Apply the apriori algorithm on the above dataset to
generate the frequent itemsets and association rules.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Given dataset
transactions = [['eggs', 'milk', 'bread'],
                ['eggs', 'apple'],
                ['milk', 'bread'],
                ['apple', 'milk'],
                ['milk', 'apple', 'bread']]

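# Convert the categorical items into a one-hot (True/False) table,
# one column per item and one row per transaction
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)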

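# Generate the frequent itemsets with a minimum support of 0.5
frequent_itemsets = apriori(df, min_support=0.5, use_colnames=True)
print(frequent_itemsets)

# Generate the association rules, keeping those with lift >= 1
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])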
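
The mlxtend results can be cross-checked by hand with the standard library only. The following is a rough sketch rather than the Apriori algorithm itself: it enumerates every itemset by brute force (no candidate pruning), which is acceptable only because the dataset has four distinct items. The helper names support and frequent_itemsets_bruteforce are illustrative.

```python
from itertools import combinations

transactions = [['eggs', 'milk', 'bread'],
                ['eggs', 'apple'],
                ['milk', 'bread'],
                ['apple', 'milk'],
                ['milk', 'apple', 'bread']]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def frequent_itemsets_bruteforce(transactions, min_support):
    """Enumerate all itemsets and keep those meeting min_support.

    Brute force, with no Apriori pruning; fine for a five-row dataset.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = support(combo, transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


freq = frequent_itemsets_bruteforce(transactions, 0.5)
for itemset, s in freq.items():
    print(sorted(itemset), s)

# Confidence and lift for the rule bread -> milk:
#   confidence = support(bread and milk) / support(bread)
#   lift       = confidence / support(milk)
conf = support(['bread', 'milk'], transactions) / support(['bread'], transactions)
lift = conf / support(['milk'], transactions)
print(conf, lift)
```

With min_support = 0.5 the frequent itemsets are {apple}, {bread}, {milk} and {bread, milk}; eggs appears in only 2 of the 5 transactions, so it is dropped. Since {bread, milk} is the only frequent pair, the rules can only be bread -> milk and milk -> bread.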