Technical Tidbits From Spatial Analysis & Data Science

Category: R

Unhide hidden data using jitter in the R package ggplot2

Posted on May 5, 2014 by zev@zevross.com · 2 Comments

When you’re plotting a lot of data, overplotting can obscure important patterns. In situations like this, it can be useful during exploratory data analysis to ‘jitter’ the data so that the underlying points become visible and patterns are easier to see. In addition, jittering can be a way of ‘anonymizing’ spatial data […]

Continue reading →
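As a rough illustration of the jittering idea in the excerpt above, here is a minimal sketch using ggplot2's geom_jitter(). The data frame `d`, its column names and the jitter amounts are made up for illustration and are not taken from the original post.

```r
library(ggplot2)

set.seed(42)  # make the random jitter and sample data reproducible

# Hypothetical example data: 500 points restricted to a few integer values,
# so many observations sit exactly on top of one another.
d <- data.frame(
  x = sample(1:5, 500, replace = TRUE),
  y = sample(1:5, 500, replace = TRUE)
)

# Without jitter: at most 25 distinct point locations are visible.
p_plain <- ggplot(d, aes(x, y)) + geom_point()

# With jitter: each point is shifted by a small random amount
# (up to 0.2 units in each direction), which reveals how many
# observations share each location.
p_jitter <- ggplot(d, aes(x, y)) +
  geom_jitter(width = 0.2, height = 0.2, alpha = 0.5)

print(p_jitter)
```

The jitter amounts are a judgment call: too little leaves points stacked, while too much blurs the real structure of the data.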

For large tables in R dplyr's function inner_join() is much faster than merge()

Posted on April 30, 2014 by zev@zevross.com

Using the merge() function in R on big tables can be time-consuming. Luckily, the join functions in the new package dplyr are much faster. The package offers four different joins:

  • inner_join (similar to merge() with all.x=FALSE and all.y=FALSE)
  • left_join (similar to merge() with all.x=TRUE and all.y=FALSE)
  • semi_join (not really an equivalent in merge() unless […]

Continue reading →
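To make the correspondence between the dplyr joins and merge() in the excerpt above concrete, here is a small sketch on two toy tables. The tables `x` and `y` and their columns are hypothetical, and only the three joins named in the excerpt are shown. It illustrates what each join returns, not how fast it runs.

```r
library(dplyr)

# Hypothetical tables sharing an "id" key column.
x <- data.frame(id = c(1, 2, 3), a = c("a1", "a2", "a3"),
                stringsAsFactors = FALSE)
y <- data.frame(id = c(2, 3, 4), b = c("b2", "b3", "b4"),
                stringsAsFactors = FALSE)

# inner_join keeps only ids present in both tables (here 2 and 3),
# like merge() with all.x = FALSE and all.y = FALSE.
inner_dplyr <- inner_join(x, y, by = "id")
inner_base  <- merge(x, y, by = "id", all.x = FALSE, all.y = FALSE)

# left_join keeps every row of x and fills b with NA where y has no match,
# like merge() with all.x = TRUE and all.y = FALSE.
left_dplyr <- left_join(x, y, by = "id")
left_base  <- merge(x, y, by = "id", all.x = TRUE, all.y = FALSE)

# semi_join keeps the rows of x that have a match in y,
# but returns only the columns of x.
semi_dplyr <- semi_join(x, y, by = "id")

print(inner_dplyr)
print(left_dplyr)
print(semi_dplyr)
```

Row order can differ between the two approaches, since merge() sorts on the key by default, so compare results by key rather than by position.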

R, Python, PostgreSQL (and more): A data science workflow example

Posted on April 29, 2014 by zev@zevross.com

Although many data science-related projects can be completed with a single software tool, we often find that choosing a tool for a project involves weighing which tool would be “best” for the job, which tools we're most familiar with, and whether we already have scripts we can use. As […]

Continue reading →

Using R to quickly create an interactive online map using the leafletR package

Posted on April 11, 2014 by zev@zevross.com · 3 Comments

In a recent post, we identified the first publication date for all spatial packages listed in the Analysis of Spatial Data Task View on the R website. The most recent of these, published in March 2014, is the leafletR package by Christian Graul. We were surprised and impressed that, if […]

Continue reading →

Interactive visualization: from R to D3 using rCharts

Posted on April 3, 2014 by zev@zevross.com · 1 Comment

Data Driven Documents, or D3 for short, is an incredible JavaScript library for creating interactive data visualizations on the web. Earlier this year, for example, we illustrated the power of D3 by interactively linking maps and charts in a visualization. D3, however, can be challenging to work with, especially if you don't have experience with […]

Continue reading →

Four reasons why you should check out the R package dplyr

Posted on March 26, 2014 by zev@zevross.com · 2 Comments

The R package dplyr, written by Hadley Wickham, is only a few months old but has already become an important part of our data analysis and manipulation workflow, replacing functions that we have used for years. There are several reasons why dplyr is such a valuable tool, but the most important from my perspective are the following: Speed. […]

Continue reading →

PostgreSQL, R, US Census geography and encoding

Posted on March 20, 2014 by zev@zevross.com

We use PostgreSQL/PostGIS to manage a lot of our tabular and geographic data from the US Census. In terms of workflow, we either download a shapefile manually from ftp://ftp.census.gov/geo/tiger/ or, if we’re dealing with more than one file (block groups or blocks, for example), we do this from within R (using the download.file() […]

Continue reading →

Geocoding With R’s ggmap Package

Posted on March 19, 2014 by zev@zevross.com · 1 Comment

One of the great features of the new ggmap package is its geocoding functionality. Other R functions can be used to geocode, but they fail to provide detailed output like geocode accuracy, which is often critical. You need to know if the lat/long in the output refers to a rooftop location or a city center, […]

Continue reading →

Working with the R package data.table

Posted on November 25, 2013 by zev@zevross.com

For a recent project we have been working with a relatively large database of all historical air pollution data for New York City, going all the way back to the 1950s. With both daily and hourly measurements, the database contains about 10 million records and 20 variables. Using traditional R functions, working with a database of […]

Continue reading →

© 2025 Technical Tidbits From Spatial Analysis & Data Science