Predictive modeling and machine learning in R with the caret package

Table of contents: Powerful and simplified modeling with caret; Streamlined and consistent syntax for more than 200 different models; Run all models with the train() function; Easy data splitting; Realistic model estimates through built-in resampling; An R-squared from a model based on the full dataset is unrealistic; An R-squared based on resampling is more realistic […]

Tips and tricks for working with images and figures in R Markdown documents

Writing reports in R Markdown lets you skip painful, error-prone copy-and-paste in favor of dynamically generated reports, written in R and Markdown, that are easy to reproduce and update. Reports that are heavy on graphs and maps, though, can yield large HTML files that are not optimized for web viewing. R Markdown offers […]

Using the new R package, FedData, to access federal open datasets (including interactive graphics)

The FedData package (created by R. Kyle Bocinsky) is a great new R package that provides easy access to some important federal datasets. The package is well designed and provides functions to download climate, elevation, hydrography, and other data for your area of interest. The following five sources of data are currently available for download with […]

Manipulating and mapping US Census data in R using the acs, tigris and leaflet packages

Census data the hard way; Plotting census data with ggplot2; Interactive plotting with the leaflet package; Option 1: Convert the data.frame back to a SpatialPolygonsDataFrame; Option 2: Make use of the existing SpatialPolygonsDataFrame; Census data the easy(er) way; 1) Set up the packages; 2) Get the spatial data (tigris); 3) Get the tabular data (acs) […]

Scrape website data with the new R package rvest (+ a postscript on interacting with web pages with RSelenium)

Copying tables or lists from a website is not only painful and dull but also error-prone and hard to reproduce. Thankfully, there are packages in Python and R to automate the process. In a previous post we described using Python's Beautiful Soup to extract information from web pages. In this post we […]
