by V.V.S.Manikanta Prasad
What can machine learning be used for?
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can change when exposed to new data.
Ultimately, the programming language you use for machine learning should consider your own requirements and predilections. No one can meaningfully address those concerns for you.
What Languages Are Being Used
Before I give you my opinion, it is good to have a look around to see what languages and platforms are popular in self-selected communities of data analysis and machine learning professionals.
The survey results suggest heavy use of R and Python and SQL for data access.
SAS and MATLAB rank higher than I would have expected. I’d expect SAS accounts for larger corporate (Fortune 500) data analysis and MATLAB for engineering, research and student use.
The results suggested the abundant use of R and show good use of MATLAB and SAS with much lower Python representation. I can attest that I prefer R over Python for competition work. It just feels though it has more on offer in terms of data analysis and algorithm selection.
Python is fragmented by comprehensive and can be very slow unless you drop into C. When Python is not working with a well-defined feature matrix and uses Pandas and NLTK.
The pie chart gives more detail on the use of programming languages and suggests another category that is as close to as large as large as the usage of R.
Machine Learning Languages
I think of programming languages in the context of the machine learning activities I want to perform.
I think MATLAB is excellent for representing and working with matrices. As such, I think it’s an excellent language or platform to use when climbing into the linear algebra of a given method.
R is a workhorse for statistical analysis and by extension machine learning. It is the platform to use to understand and explore your data using statistical methods and graphs. It has an enormous number of machine learning algorithms, and advanced implementations to written by the developers of the algorithm.
Python if a popular scientific language and a rising star for machine learning. I’d be surprised if it can take the data analysis mantle from R, but matrix handling in NumPy may challenge MATLAB and communication tools like IPython are very attractive and a step into the future of reproducibility.
I think the SciPy stack for machine learning and data analysis can be used for one-off projects (like papers), and frameworks like scikit-learn are mature enough to be used in production systems.
Machine learning is algorithms, not magic. When it comes to serious production implementations, you need a robust library or you customize an implementation of the algorithm for your needs.
I think we may prototype in R or Python, but you will implement in a heavier language for reasons such as execution speed and system reliability. For example, the backend of BigML is implemented in Clojure.