Visio AI Product Documentation
Visio AI is a comprehensive web application designed to empower users with intuitive tools for data analysis, visualization, and machine learning. Built with Streamlit, it provides a user-friendly interface to perform complex data science tasks without writing a single line of code.
1. Introduction to Visio AI
Visio AI aims to democratize data science, making powerful analytical capabilities accessible to everyone from beginners to seasoned data professionals. It streamlines the workflow from raw data upload to model training and evaluation, all within an interactive and visually appealing environment.
2. Key Features
2.1. Data Upload & Management
- Flexible Uploads: Supports CSV, XLSX, and TXT (tab-separated) formats.
- Instant Preview: Preview the original dataset right after upload, and the updated dataset after each modification.
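
As a rough illustration of how the three supported formats could be read into a DataFrame, here is a minimal sketch. The function name `load_dataset` and the extension-based dispatch are assumptions made for illustration; the source does not show how Visio AI reads files internally.

```python
import io
from pathlib import Path

import pandas as pd


def load_dataset(file_obj, filename: str) -> pd.DataFrame:
    """Read an uploaded file into a DataFrame, chosen by file extension.

    Hypothetical helper: the dispatch below is an assumption, not the
    application's confirmed implementation.
    """
    suffix = Path(filename).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(file_obj)
    if suffix == ".xlsx":
        # Reading .xlsx requires an Excel engine such as openpyxl.
        return pd.read_excel(file_obj)
    if suffix == ".txt":
        # TXT files are treated as tab-separated, as described above.
        return pd.read_csv(file_obj, sep="\t")
    raise ValueError(f"Unsupported file type: {suffix or filename}")


# Usage with an in-memory tab-separated sample instead of a real upload.
sample = io.StringIO("city\ttemp\nOslo\t4\nCairo\t31\n")
df = load_dataset(sample, "weather.txt")
print(df.head())
```

In a Streamlit app, the object returned by `st.file_uploader` is file-like and exposes a `name` attribute, so it could be passed to a helper like this directly.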
2.2. Data Preprocessing & Cleaning
2.2.1. Missing Value Handling
- Automated Imputation: One-click options for mean, median, mode, forward fill, or backward fill across all relevant columns.
- Manual Control: Granular control to select specific columns and apply precise imputation strategies.
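
The imputation strategies listed above map onto standard pandas operations. The sketch below shows one possible way to apply them either to all columns or to a chosen subset. The function `impute` and its strategy names are hypothetical, and skipping non-numeric columns for mean and median is an assumption about what "relevant columns" means.

```python
import numpy as np
import pandas as pd


def impute(df: pd.DataFrame, strategy: str, columns=None) -> pd.DataFrame:
    """Return a copy of df with missing values filled.

    strategy: "mean", "median", "mode", "ffill", or "bfill" (assumed names).
    columns:  columns to process; None means every column.
    """
    out = df.copy()
    if columns is None:
        columns = list(out.columns)
    for col in columns:
        if strategy in ("mean", "median"):
            # Mean and median only make sense for numeric columns.
            if not pd.api.types.is_numeric_dtype(out[col]):
                continue
            fill = out[col].mean() if strategy == "mean" else out[col].median()
            out[col] = out[col].fillna(fill)
        elif strategy == "mode":
            modes = out[col].mode(dropna=True)
            if not modes.empty:
                out[col] = out[col].fillna(modes.iloc[0])
        elif strategy == "ffill":
            out[col] = out[col].ffill()
        elif strategy == "bfill":
            out[col] = out[col].bfill()
        else:
            raise ValueError(f"Unknown strategy: {strategy}")
    return out


raw = pd.DataFrame(
    {"age": [20.0, np.nan, 40.0], "city": ["Oslo", None, "Oslo"]}
)
auto_filled = impute(raw, "mean")                      # all relevant columns
manual_filled = impute(raw, "mode", columns=["city"])  # one chosen column
```

Calling the helper without `columns` corresponds to the one-click option, while passing a column list corresponds to manual control.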
2.3. Data Operations & Machine Learning
2.3.1. Target Variable Definition
- Identify the column you wish to predict.
- Automatic problem type detection (Classification or Regression) based on target variable characteristics.
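
The source does not state which target characteristics drive the detection. One common heuristic, shown here purely as an assumption, treats non-numeric targets and numeric targets with few distinct values as classification, and everything else as regression. The function name and the `max_classes` threshold of 10 are hypothetical.

```python
import pandas as pd


def detect_problem_type(target: pd.Series, max_classes: int = 10) -> str:
    """Guess the problem type from the target column.

    Assumed heuristic, not the application's confirmed rule.
    """
    if not pd.api.types.is_numeric_dtype(target):
        return "Classification"
    # Numeric targets with few distinct values are treated as class labels.
    if target.nunique(dropna=True) <= max_classes:
        return "Classification"
    return "Regression"


labels = pd.Series(["spam", "ham", "spam"])
binary = pd.Series([0, 1, 1, 0])
prices = pd.Series([i * 1.5 for i in range(100)])

print(detect_problem_type(labels))
print(detect_problem_type(binary))
print(detect_problem_type(prices))
```

A threshold-based rule like this can misclassify low-cardinality numeric targets such as small counts, so the detected type is worth checking before training.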
2.3.2. Data Splitting (Train/Test)
- Adjust test set size for robust model validation.
- Set a random state for reproducible results.
- Automated internal preprocessing:
  - Label Encoding for categorical features.
  - Standard Scaling for numerical features.
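
The splitting and preprocessing steps above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions: the function name `split_and_preprocess` is hypothetical, encoders are fitted before the split so that the test set never contains unseen labels, and only originally numerical columns are scaled. The application's actual order of operations is not documented.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler


def split_and_preprocess(df, target, test_size=0.2, random_state=42):
    """Encode, split, and scale a dataset (illustrative sketch only)."""
    X = df.drop(columns=[target]).copy()
    y = df[target]

    numerical = list(X.select_dtypes(include="number").columns)
    categorical = list(X.select_dtypes(exclude="number").columns)

    # Label-encode each categorical feature as integer codes.
    for col in categorical:
        X[col] = LabelEncoder().fit_transform(X[col].astype(str))

    # Cast numerical features to float so scaled values fit the dtype.
    if numerical:
        X[numerical] = X[numerical].astype(float)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    X_train, X_test = X_train.copy(), X_test.copy()

    # Fit the scaler on the training split only, then reuse it for test data.
    if numerical:
        scaler = StandardScaler()
        X_train[numerical] = scaler.fit_transform(X_train[numerical])
        X_test[numerical] = scaler.transform(X_test[numerical])

    return X_train, X_test, y_train, y_test


data = pd.DataFrame(
    {
        "color": ["red", "blue"] * 5,
        "size": list(range(10)),
        "label": [0, 1] * 5,
    }
)
X_train, X_test, y_train, y_test = split_and_preprocess(
    data, "label", test_size=0.2, random_state=0
)
```

Fixing `random_state` makes the split repeatable, which is what the random state setting in the interface is for.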
2.3.3. Algorithm Selection
- Dynamically filtered list of appropriate machine learning algorithms based on the detected problem type.
- For Classification: Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Support Vector Classifier (SVC), K-Nearest Neighbors Classifier, Gaussian Naive Bayes.
- For Regression: Linear Regression, Decision Tree Regressor, Random Forest Regressor, Support Vector Regressor (SVR), K-Nearest Neighbors Regressor.
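
One simple way to filter the algorithm list by problem type is a dictionary keyed on the detected type. The sketch below uses the scikit-learn estimators that correspond to the names listed above. The `ALGORITHMS` mapping and the helper names are hypothetical, and all estimators are created with default parameters.

```python
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.svm import SVC, SVR
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Map each problem type to display names and estimator classes.
# Classes (not instances) are stored so each run gets a fresh model.
ALGORITHMS = {
    "Classification": {
        "Logistic Regression": LogisticRegression,
        "Decision Tree Classifier": DecisionTreeClassifier,
        "Random Forest Classifier": RandomForestClassifier,
        "Support Vector Classifier (SVC)": SVC,
        "K-Nearest Neighbors Classifier": KNeighborsClassifier,
        "Gaussian Naive Bayes": GaussianNB,
    },
    "Regression": {
        "Linear Regression": LinearRegression,
        "Decision Tree Regressor": DecisionTreeRegressor,
        "Random Forest Regressor": RandomForestRegressor,
        "Support Vector Regressor (SVR)": SVR,
        "K-Nearest Neighbors Regressor": KNeighborsRegressor,
    },
}


def available_algorithms(problem_type: str) -> list:
    """Return the algorithm names offered for a problem type."""
    return list(ALGORITHMS[problem_type])


def build_model(problem_type: str, name: str):
    """Create an untrained estimator with default parameters."""
    return ALGORITHMS[problem_type][name]()


regression_choices = available_algorithms("Regression")
model = build_model("Regression", "Linear Regression")
```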
2.3.4. Model Training & Evaluation
- Initiate model training with a single click.
- View comprehensive evaluation metrics:
  - Classification: Accuracy, Precision, Recall, F1-Score, and an interactive Confusion Matrix heatmap.
  - Regression: Mean Squared Error (MSE), R-squared (R2) score.
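
The metrics above are all available in `sklearn.metrics`. The following sketch computes them from true and predicted values. The `evaluate` function is hypothetical, and the use of weighted averaging for precision, recall, and F1 is an assumption, since the source does not say how multi-class scores are averaged.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
)


def evaluate(problem_type, y_true, y_pred):
    """Return a dict of metrics for the given problem type."""
    if problem_type == "Classification":
        # average="weighted" is an assumed choice for multi-class targets.
        kwargs = {"average": "weighted", "zero_division": 0}
        return {
            "accuracy": accuracy_score(y_true, y_pred),
            "precision": precision_score(y_true, y_pred, **kwargs),
            "recall": recall_score(y_true, y_pred, **kwargs),
            "f1": f1_score(y_true, y_pred, **kwargs),
            "confusion_matrix": confusion_matrix(y_true, y_pred),
        }
    return {
        "mse": mean_squared_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }


clf_report = evaluate("Classification", [0, 1, 1, 0], [0, 1, 0, 0])
reg_report = evaluate("Regression", [1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

The confusion matrix is returned as a NumPy array, which a plotting library could then render as a heatmap.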
2.4. Data Visualization
- Pair Plots: Generate both static (Seaborn) and interactive (Plotly) pair plots for numerical columns, revealing relationships and distributions.
2.5. Utility Tools (Sidebar)
- Note -- Lite: A simple in-app notepad.
- WordCloud: Create visual word clouds from text data.
- Viz AI (img): (Future Feature) Advanced image-based visualization powered by AI.
- Calculator: A basic arithmetic calculator.
- Viz Editor: (Future Feature) Tools for advanced customization of visualizations.
- Viz Report: (Future Feature) Generate detailed analytical reports.
3. Technical Stack
Visio AI is built on a set of open-source Python libraries chosen for performance and flexibility:
- Frontend/Framework: Streamlit
- Data Manipulation: Pandas
- Numerical Operations: NumPy
- Machine Learning: Scikit-learn
- Data Visualization: Matplotlib, Seaborn, Plotly Express
- Web Integration: Python's standard-library `webbrowser` module (for opening external links)
4. How to Use (Quick Start)
- Launch the Streamlit application (e.g., `streamlit run home.py`).
- Upload your dataset using the file uploader.
- Review the "Missing Values Report" and handle missing data using automatic or manual options.
- Open the "Data Operations & Algorithms" expander.
- Select your target column and observe the detected problem type.
- Adjust the test set size and random state, then perform the train/test split.
- Choose your preferred Machine Learning Algorithm.
- Click "Train Model" to run the algorithm.
- Scroll down to "Machine Learning Operations" to view model performance metrics.
- Explore relationships in your data using the "Generate Pair Plot" button.
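
For readers who want to see what the click-through workflow roughly corresponds to in code, here is a compact script version of the split, train, and evaluate steps. It is an illustrative equivalent under assumptions, not the application's source: it uses scikit-learn's bundled Iris dataset in place of an uploaded file, and a Random Forest Classifier as the chosen algorithm.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for the uploaded dataset; "target" is the column to predict.
df = load_iris(as_frame=True).frame
X = df.drop(columns=["target"])
y = df["target"]

# Test set size and random state, as set in the interface.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling and the chosen algorithm, trained in one step.
model = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=42))
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```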
5. Future Enhancements
Jaiho Labs is continuously working to improve Visio AI. Planned features include:
- Advanced feature engineering techniques.
- Hyperparameter tuning and cross-validation options.
- Model deployment and prediction interfaces.
- Expanded visualization types and customization.
- More sophisticated AI-driven insights and recommendations.