DS7
Q. 1) Write a PHP script to read the “Movie.xml” file and print every MovieTitle and ActorName in it using
the DOMDocument parser. The “Movie.xml” file should contain at least 5 records with values for the following
elements under MovieInfo: MovieNo, MovieTitle, ActorName, ReleaseYear. [Marks 15]
Q. 2) Download the Market Basket dataset. Write a Python program to read the dataset and display its
information. Preprocess the data (drop null values, etc.). Convert the categorical values into numeric
format. Apply the Apriori algorithm to the dataset to generate the frequent itemsets and association
rules.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules
# Read the dataset and display its information
df = pd.read_csv("market_basket.csv")
df.info()
# Drop null values
df.dropna(inplace=True)
# Convert categorical values into numeric format (one boolean column per item)
te = TransactionEncoder()
te_ary = te.fit(df.values).transform(df.values)
df_encoded = pd.DataFrame(te_ary, columns=te.columns_)
# Apply the Apriori algorithm to generate frequent itemsets
frequent_itemsets = apriori(df_encoded, min_support=0.05, use_colnames=True)
# Generate association rules
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.2)
# Display frequent itemsets and association rules
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)
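
To make the min_support and confidence thresholds used above concrete, here is a minimal pure-Python sketch of the same idea on a tiny in-memory example. The toy transactions and the helper names (support, frequent_itemsets, confidence) are assumptions made for illustration; they are not part of the dataset or of mlxtend, and this is not how mlxtend implements apriori internally.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative only, not from the real dataset).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: build (k+1)-candidates only from frequent k-itemsets."""
    items = sorted(set().union(*transactions))
    current = [frozenset([item]) for item in items]
    result = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        result.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return result


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


freq = frequent_itemsets(transactions, min_support=0.5)
for itemset, s in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
print("confidence(butter -> bread):", confidence({"butter"}, {"bread"}, transactions))
```

With min_support=0.5 the pair {milk, butter} is dropped because it appears in only one of the four baskets, so the three-item set is never counted. This pruning is what keeps Apriori tractable on larger datasets.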