
How to apply a decision tree to a vector of strings?

+11 votes
asked Jan 25, 2020 by Ra Ouf (220 points)
How do I apply a decision tree to a vector of strings?

2 Answers

0 votes
answered Jun 25, 2024 by Mr Srikanth (190 points)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Example vector of strings
vector_of_strings = [
    "This is the first string",
    "Another string to process",
    "Final string for the example"
]
labels = [0, 1, 0]  # Example labels for classification

# Split the raw strings into train and test sets before vectorizing,
# so the test strings do not influence the TF-IDF vocabulary or weights
texts_train, texts_test, y_train, y_test = train_test_split(
    vector_of_strings, labels, test_size=0.2, random_state=42
)

# Initialize TF-IDF vectorizer
vectorizer = TfidfVectorizer()

# Fit on the training strings only, then reuse the fitted vectorizer on the test strings
X_train = vectorizer.fit_transform(texts_train)
X_test = vectorizer.transform(texts_test)

# Initialize decision tree classifier (fixed seed for reproducible results)
clf = DecisionTreeClassifier(random_state=42)

# Train the classifier
clf.fit(X_train, y_train)

# Predict on test data
y_pred = clf.predict(X_test)
print(f"Predictions: {y_pred}")

# Evaluate accuracy (with only three samples this only demonstrates the workflow)
accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")
0 votes
answered Jun 27, 2024 by Jagadheesh K (160 points)

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Example vector of strings
vector_of_strings = [
    "This is the first string",
    "Another string to process",
    "Final string for the example"
]
labels = [0, 1, 0]  # Example labels for classification

# Split the raw strings into train and test sets before vectorizing,
# so the test strings do not influence the TF-IDF vocabulary or weights
texts_train, texts_test, y_train, y_test = train_test_split(
    vector_of_strings, labels, test_size=0.2, random_state=42
)

# Initialize TF-IDF vectorizer
vectorizer = TfidfVectorizer()

# Fit on the training strings only, then reuse the fitted vectorizer on the test strings
X_train = vectorizer.fit_transform(texts_train)
X_test = vectorizer.transform(texts_test)

# Initialize decision tree classifier (fixed seed for reproducible results)
clf = DecisionTreeClassifier(random_state=42)

# Train the classifier
clf.fit(X_train, y_train)

# Predict on test data
y_pred = clf.predict(X_test)
print(f"Predictions: {y_pred}")

# Evaluate accuracy (with only three samples this only demonstrates the workflow)
accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")
