GitHub - MrLeap/predicto: A program for rapidly testing hunches for machine learning projects.

This is a script that accepts a CSV file and creates a cross-validated classification model for it.

Written in Python 2.7 for no particular reason. It may work in 3.* as well, but I haven't tested that.

First, I created a simple setup that loads every feature as a categorical variable, liberally discards rows it doesn't understand, and one-hot encodes everything. I wanted this to be a simple diagnostic script that can run on any mixed dataset, so I didn't want to hardcode any feature masks or selectors.

I then added sparse scaling and naive dimensionality reduction (DR) to the pipeline. Sparse scaling normalizes the database ID that remained as a feature, which I hoped would help it get discarded during the DR step. DR would also hopefully cull any other superfluous one-hot features.

After a few different runs I settled on a multi-layer perceptron as the classifier.
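The loading step described above can be sketched roughly as follows. This is a minimal illustration and not the code from train.py: the function name `load_one_hot`, the sample data, and the rule for what counts as a row the script "doesn't understand" (any missing, empty, or extra cell) are all assumptions. It targets Python 3 and uses only the standard library, so the scaling, DR, and perceptron stages are not shown.

```python
import csv
import io


def load_one_hot(csv_file, target_column):
    """Read a CSV, treat every feature as categorical, and one-hot encode it.

    Rows with a missing, empty, or extra cell are discarded, not repaired.
    Returns (feature_names, rows, labels). Hypothetical helper for illustration.
    """
    reader = csv.DictReader(csv_file)
    kept = []
    for record in reader:
        # DictReader stores extra cells under the key None and fills short
        # rows with None values; both are treated as unusable rows here.
        if None in record or any(
            v is None or v.strip() == "" for v in record.values()
        ):
            continue
        kept.append(record)

    feature_columns = [c for c in reader.fieldnames if c != target_column]
    # One output column per distinct (column, value) pair, in a stable order.
    feature_names = sorted({(c, r[c]) for r in kept for c in feature_columns})
    index = {name: i for i, name in enumerate(feature_names)}

    rows = []
    for r in kept:
        vector = [0] * len(feature_names)
        for c in feature_columns:
            vector[index[(c, r[c])]] = 1
        rows.append(vector)
    labels = [r[target_column] for r in kept]
    return feature_names, rows, labels


# Made-up sample data, including two rows that get discarded.
sample = io.StringIO(
    "id,color,size,label\n"
    "1,red,S,yes\n"
    "2,blue,,no\n"   # empty cell: discarded
    "3,red,L,no\n"
    "4,green\n"      # short row: discarded
)
names, X, y = load_one_hot(sample, "label")
print(names)
print(X)
print(y)
```

Note that the `id` column produces one one-hot column per row here, which is the kind of superfluous feature the DR step is meant to cull.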

Files in this repository:
.gitignore
README.md
main.py
predict.py
train.py
