DS5
Q. 1) Create an XML file named “Item.xml” with item-name, item-rate and item-quantity. Store the details of 5
items of different types.
[Marks 15]
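One possible way to produce the file for Q. 1 is to generate it with Python's standard xml.etree.ElementTree module. This is only a sketch: the root element name "items", the helper names build_items_tree and write_items_xml, and the five sample items with their rates and quantities are all assumptions made for illustration, not part of the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: (item-name, item-rate, item-quantity).
# The five items are deliberately of different kinds.
ITEMS = [
    ("Notebook", "45.00", "10"),
    ("Headphones", "1299.00", "2"),
    ("Rice 5 kg", "380.00", "3"),
    ("T-shirt", "499.00", "4"),
    ("Desk lamp", "750.00", "1"),
]


def build_items_tree(items):
    """Build an ElementTree with one <item> element per tuple."""
    root = ET.Element("items")  # root element name is an assumption
    for name, rate, quantity in items:
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "item-name").text = name
        ET.SubElement(item, "item-rate").text = rate
        ET.SubElement(item, "item-quantity").text = quantity
    return ET.ElementTree(root)


def write_items_xml(path="Item.xml", items=ITEMS):
    """Write the items to an XML file and return the path."""
    tree = build_items_tree(items)
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(tree)
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return path
```

Calling write_items_xml() with no arguments writes Item.xml to the current directory. The same structure can of course be typed by hand in a text editor.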
Q. 2) Use the iris dataset. Write a Python program to view some basic statistical details like percentile,
mean, std etc. of the species 'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica'. Apply logistic regression
on the dataset to identify the different species (setosa, versicolor, virginica) of Iris flowers given just 4
features: sepal and petal lengths and widths. Find the accuracy of the model.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris
# Load the iris dataset
iris_data = load_iris()
X = iris_data.data # Features (sepal length, sepal width, petal length, petal width)
y = iris_data.target # Target variable (species)
# Convert the dataset into a DataFrame
iris_df = pd.DataFrame(X, columns=iris_data.feature_names)
iris_df['Species'] = y
# Filter rows for each species (target codes: 0 = setosa, 1 = versicolor, 2 = virginica)
setosa_data = iris_df[iris_df['Species'] == 0]
versicolor_data = iris_df[iris_df['Species'] == 1]
virginica_data = iris_df[iris_df['Species'] == 2]
# View basic statistical details (count, mean, std, min, percentiles, max) for each species
print("Basic Statistical Details for Iris Setosa:")
print(setosa_data.describe())
print("\nBasic Statistical Details for Iris Versicolor:")
print(versicolor_data.describe())
print("\nBasic Statistical Details for Iris Virginica:")
print(virginica_data.describe())
# Split the dataset into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
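# Build a logistic regression model (max_iter raised above the default of 100
# so the solver has more room to converge on the unscaled features)
model = LogisticRegression(max_iter=200)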
# Train the model
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Calculate accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print("\nAccuracy of the Logistic Regression model:", accuracy)
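As an optional variation on the program above, the per-species statistics can be computed in one call with groupby, using species names instead of numeric codes, and the accuracy can be supplemented with a confusion matrix. The sketch below is one way to do this and is not required by the question. The function names species_summary and evaluate, the chosen percentiles (10th, 50th, 90th), the stratified split and max_iter=200 are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Replace the integer codes 0, 1, 2 with the species names.
df["Species"] = iris.target_names[iris.target]


def species_summary(frame, percentiles=(0.1, 0.5, 0.9)):
    """Return count, mean, std, min, the given percentiles and max per species."""
    return frame.groupby("Species").describe(percentiles=list(percentiles))


def evaluate(test_size=0.2, random_state=42):
    """Fit logistic regression and return (accuracy, confusion matrix)."""
    # stratify keeps the three species equally represented in the test set
    # (an assumption; the program above uses a plain random split).
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data,
        iris.target,
        test_size=test_size,
        random_state=random_state,
        stratify=iris.target,
    )
    model = LogisticRegression(max_iter=200)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return accuracy_score(y_test, y_pred), confusion_matrix(y_test, y_pred)


if __name__ == "__main__":
    # Show the statistics for one feature to keep the output readable.
    print(species_summary(df)["petal length (cm)"])
    accuracy, matrix = evaluate()
    print("Accuracy:", accuracy)
    print("Confusion matrix (rows = true species, columns = predicted):")
    print(matrix)
```

The confusion matrix shows which species are confused with each other, which a single accuracy figure cannot reveal.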