In the data-centric world we live in, we need many tools in our data science toolkit. The kit should include expertise in spreadsheets, statistics, databases and machine learning. Math and statistics are the backbone of data analysis, machine learning and artificial intelligence, yet they are difficult to learn and even more difficult to retain. Other obstacles to learning and using statistical packages include the cost of commercial packages such as SPSS, SAS and Stata, and the fact that many stats courses demand long-hand calculation of statistical methods.
Enter jamovi. This is a free, open-source statistical package built on the R programming language. Jamovi has many intuitive features that should appeal to the budding data scientist, which makes it a great choice for beginners.
Ease of use - jamovi is a desktop app available for Windows, Mac, Linux and ChromeOS. It will import .csv, .xlsx, SAS, Stata, JASP and SPSS files. There is an online demo so potential users can test drive it before downloading. Files can be saved as .csv files or as .omv files, the format jamovi uses to store the data together with the analysis. In other words, you can continue a prior analysis without losing data or work. Furthermore, having all of your work in one .omv file makes sharing easier via email or a data sharing platform such as Open Science Framework.
Based on R - the R syntax mode displays the R code used to analyze the data, so it can serve as an educational tool for someone learning the R programming language. The generated R code can also be copied and pasted into RStudio as desired.
Spreadsheet functionality - once a file has been opened there are two main menus, Data and Analyses. Under the Data menu, a user can examine the columns and rows and delete or add new ones. Data types have different icons for continuous, ordinal and nominal data. In addition, new columns can be created with computations or transformations. The compute option allows the user to add a mathematical calculation, such as a mean or sum, across rows or columns. The transform option allows the user to, for example, log10 transform skewed data in one or more columns with a few clicks. Transforms can be saved like functions and used again. Missing data can be handled by inserting the mean value of the predictor. As in a spreadsheet, the filter option restricts the dataset to the rows of interest, for example female subjects between the ages of 20 and 50.
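For readers who want to see what these Data menu operations amount to, here is a minimal sketch in plain Python. It is not jamovi's own code; the records, the column names (sex, age, cholesterol) and the values are all hypothetical, chosen only to illustrate mean imputation, a log10 transform and a filter.

```python
import math
from statistics import mean

# Hypothetical records for illustration only; None marks a missing value.
rows = [
    {"sex": "F", "age": 34, "cholesterol": 210.0},
    {"sex": "M", "age": 45, "cholesterol": None},
    {"sex": "F", "age": 62, "cholesterol": 1000.0},
    {"sex": "F", "age": 20, "cholesterol": 100.0},
]

# Impute: replace a missing cholesterol value with the mean of the observed ones.
observed = [r["cholesterol"] for r in rows if r["cholesterol"] is not None]
fill_value = mean(observed)
for r in rows:
    if r["cholesterol"] is None:
        r["cholesterol"] = fill_value

# Transform: add a new column holding the log10 of a skewed variable.
for r in rows:
    r["log10_cholesterol"] = math.log10(r["cholesterol"])

# Filter: keep female subjects aged 20 to 50 inclusive.
selected = [r for r in rows if r["sex"] == "F" and 20 <= r["age"] <= 50]

for r in selected:
    print(r["age"], r["cholesterol"], round(r["log10_cholesterol"], 3))
```

In jamovi each of these steps is a few clicks; the sketch only shows the arithmetic behind them.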
Analyses - the first option is Exploration and under that the first choice is Descriptive Statistics.
To run descriptive statistics, click or drag a predictor into the variables box and, if you wish, split the results by a categorical variable such as gender or the presence or absence of heart disease. By default, jamovi automatically generates a table of descriptive statistics in the results window on the right. Further down is a Statistics menu where we can add standard deviation, quartiles, mode, etc. Selecting the variable cholesterol automatically generates the following table.
Scroll down to Plots and select box plot and histogram, and the two graphs below are generated. They can be copied and pasted or exported as PNG or PDF files. Note that the images are APA quality, ready for publication.
Also under Exploration are options to generate a scatterplot and a Pareto chart.
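The same kind of split-by-group summary can be sketched with Python's standard library. The data below are invented for illustration, and the quartile method ("inclusive") is my own choice here, not a statement about how jamovi computes quartiles.

```python
from collections import defaultdict
from statistics import mean, median, quantiles, stdev

# Hypothetical (heart_disease, cholesterol) pairs for illustration only.
records = [
    ("yes", 250), ("yes", 270), ("yes", 290), ("yes", 310),
    ("no", 180), ("no", 200), ("no", 220), ("no", 240),
]

# Split the cholesterol values by the categorical variable.
groups = defaultdict(list)
for disease, cholesterol in records:
    groups[disease].append(cholesterol)


def describe(values):
    """Return basic descriptive statistics for one group of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "q1": q1,
        "q3": q3,
    }


summary = {group: describe(values) for group, values in groups.items()}

for group, stats in summary.items():
    print(group, stats)
```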
In addition to the Exploration category, the other options under Analyses are:
1. T-tests (independent, paired and single sample)
2. ANOVA (one-way, repeated measures, ANCOVA, MANCOVA and non-parametric choices)
3. Regression (linear regression, logistic regression [binomial, multinomial, ordinal])
4. Frequencies (one sample proportion, contingency tables)
5. Factor (reliability analysis, principal component analysis, exploratory factor analysis)
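To give a feel for what one of these menu items calculates, here is a small sketch of the statistic behind an independent samples t-test, using the classic equal-variance (Student's) form. The function name student_t and the two groups of values are hypothetical, and the sketch stops at the t statistic and degrees of freedom; turning them into a p-value needs the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Return (t statistic, degrees of freedom) for two independent samples,
    assuming equal variances in the two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical cholesterol values for two groups, for illustration only.
group_a = [250, 270, 290, 310]
group_b = [180, 200, 220, 240]

t_stat, df = student_t(group_a, group_b)
print(round(t_stat, 3), df)
```

A statistical package adds the p-value, confidence interval, effect size and assumption checks on top of this arithmetic, which is the long-hand work a tool like jamovi saves.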
Jamovi Library - there are multiple add-ons (modules) available:
1. Power analysis
2. General analysis for linear models
3. Example datasets
4. Confidence intervals and effect size
5. Meta-analysis
and many more.
Educational materials - in addition to a robust jamovi Forum and sample datasets there are:
1. An online user guide with video tutorials to get started
2. A free e-textbook, Learning Statistics with jamovi, of more than 400 pages that is very well written and easy to understand
3. A 5-hour YouTube video that covers all of the important features and is associated with a free course
Data science tools should be affordable and easy to use. Jamovi delivers on both counts.
This blog was also posted on Medium.com https://medium.com/@rehoyt/jamovi-a-free-statistical-package-for-your-data-science-toolkit-370aecb81e3f