DS5


Q. 1) Create an XML file named “Item.xml” with item-name, item-rate and item-quantity. Store the details of 5
items of different types.
[Marks 15]

<?xml version="1.0" encoding="UTF-8"?>
<Items>
  <Item>
    <ItemName>Apple</ItemName>
    <ItemRate>2.50</ItemRate>
    <ItemQuantity>10</ItemQuantity>
  </Item>
  <Item>
    <ItemName>Orange</ItemName>
    <ItemRate>3.00</ItemRate>
    <ItemQuantity>15</ItemQuantity>
  </Item>
  <Item>
    <ItemName>Banana</ItemName>
    <ItemRate>1.50</ItemRate>
    <ItemQuantity>20</ItemQuantity>
  </Item>
  <Item>
    <ItemName>Mango</ItemName>
    <ItemRate>4.00</ItemRate>
    <ItemQuantity>12</ItemQuantity>
  </Item>
  <Item>
    <ItemName>Grapes</ItemName>
    <ItemRate>3.50</ItemRate>
    <ItemQuantity>8</ItemQuantity>
  </Item>
</Items>
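
One way to check that the file is well-formed and holds the intended values is to read it back with Python's standard library. The sketch below is not part of the required answer. It assumes the same element names as the XML above and embeds the content as a string so it runs on its own; the helper name parse_items and the derived stock value are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same content as Item.xml, embedded here so the sketch is self-contained.
ITEM_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Items>
  <Item><ItemName>Apple</ItemName><ItemRate>2.50</ItemRate><ItemQuantity>10</ItemQuantity></Item>
  <Item><ItemName>Orange</ItemName><ItemRate>3.00</ItemRate><ItemQuantity>15</ItemQuantity></Item>
  <Item><ItemName>Banana</ItemName><ItemRate>1.50</ItemRate><ItemQuantity>20</ItemQuantity></Item>
  <Item><ItemName>Mango</ItemName><ItemRate>4.00</ItemRate><ItemQuantity>12</ItemQuantity></Item>
  <Item><ItemName>Grapes</ItemName><ItemRate>3.50</ItemRate><ItemQuantity>8</ItemQuantity></Item>
</Items>"""


def parse_items(xml_bytes):
    """Return one dict per <Item> with name, rate (float) and quantity (int)."""
    root = ET.fromstring(xml_bytes)
    items = []
    for item in root.findall("Item"):
        items.append({
            "name": item.findtext("ItemName"),
            "rate": float(item.findtext("ItemRate")),
            "quantity": int(item.findtext("ItemQuantity")),
        })
    return items


# Parse from bytes so the encoding declaration in the prolog is honoured.
items = parse_items(ITEM_XML.encode("utf-8"))
# Illustrative derived value: rate times quantity, summed over all items.
total_value = sum(i["rate"] * i["quantity"] for i in items)

for i in items:
    print(i["name"], i["rate"], i["quantity"])
print("Total stock value:", total_value)
```

To read the saved file instead of the embedded string, ET.parse("Item.xml").getroot() returns the same root element.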

Q. 2) Use the iris dataset. Write a Python program to view some basic statistical details like percentile,
mean, std etc. of the species 'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica'. Apply logistic regression
on the dataset to identify the different species (setosa, versicolor, virginica) of Iris flowers given just 4
features: sepal and petal lengths and widths. Find the accuracy of the model.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris

# Load the iris dataset

iris_data = load_iris()
X = iris_data.data # Features (sepal length, sepal width, petal length, petal width)
y = iris_data.target # Target variable (species)

# Convert the dataset into a DataFrame

iris_df = pd.DataFrame(X, columns=iris_data.feature_names)
iris_df['Species'] = y

# Filter data for the three species: setosa (0), versicolor (1), virginica (2)

setosa_data = iris_df[iris_df['Species'] == 0]
versicolor_data = iris_df[iris_df['Species'] == 1]
virginica_data = iris_df[iris_df['Species'] == 2]

# View basic statistical details (count, mean, std, min, percentiles, max) for each species

print("Basic Statistical Details for Iris Setosa:")
print(setosa_data.describe())
print("\nBasic Statistical Details for Iris Versicolor:")
print(versicolor_data.describe())
print("\nBasic Statistical Details for Iris Virginica:")
print(virginica_data.describe())

# Split the dataset into training (80%) and testing (20%) sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build a logistic regression model

model = LogisticRegression(max_iter=200)  # extra iterations help the solver converge on unscaled features

# Train the model

model.fit(X_train, y_train)

# Make predictions on the test set

y_pred = model.predict(X_test)

# Calculate accuracy of the model

accuracy = accuracy_score(y_test, y_pred)
print("\nAccuracy of the Logistic Regression model:", accuracy)
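
The question names the species as 'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica', while load_iris() encodes them as 0, 1 and 2. As an optional variation, the sketch below labels each row with the named form and gets the per-species statistics in one groupby call. The "Iris-" prefix and the helper name species_summary are assumptions made for illustration, not part of scikit-learn.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Assumed naming: prefix scikit-learn's target names with "Iris-" to match the question.
df["Species"] = [f"Iris-{iris.target_names[code]}" for code in iris.target]


def species_summary(frame, percentiles=(0.25, 0.5, 0.75)):
    """Describe every feature per species: count, mean, std, min, percentiles, max."""
    return frame.groupby("Species").describe(percentiles=list(percentiles))


summary = species_summary(df)

# The result has two-level columns (feature, statistic); select one feature to view it.
print(summary["petal length (cm)"])
```

Passing other values, for example species_summary(df, percentiles=(0.1, 0.9)), adds the 10th and 90th percentiles; pandas always includes the median.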