
Downloads and parses a dataset from a Socrata open data portal URL, returning it as a tibble or sf object. Metadata is also returned as attributes on the returned object.

Usage

soc_read(url, query = soc_query(), alias = "label", page_size = 10000)

Arguments

url

string; URL of the Socrata dataset.

query

soc_query(); Query parameter specification.

alias

string; Use of field alias values. There are three options:

  • "label": field alias values are assigned as a label attribute for each field.

  • "replace": field alias values replace existing column names.

  • "drop": field alias values are discarded and existing column names are kept.
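
With alias = "label", the alias is described above as a label attribute on each field, so it can be read back with base R's attr(). The following is a minimal sketch of that idea, not part of the package: column_labels() is a hypothetical helper, and the small data frame is an offline stand-in (with an invented "Magnitude" label) for a result returned by soc_read().

```r
# Hypothetical helper (not part of the package): collect the "label"
# attribute of every column. Columns without a label yield NA.
column_labels <- function(x) {
  vapply(
    x,
    function(col) {
      label <- attr(col, "label", exact = TRUE)
      if (is.null(label)) NA_character_ else as.character(label)[1]
    },
    character(1)
  )
}

# Offline stand-in for a result read with alias = "label";
# the label text here is made up for illustration.
quakes <- data.frame(magnitude = c(2.2, 2.7), depth = c(0.9, 1.2))
attr(quakes$magnitude, "label") <- "Magnitude"

column_labels(quakes)
```

Assuming the labels are stored as described, the same helper could be applied to a real result, for example column_labels(soc_read(url)).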

page_size

whole number; Maximum number of rows returned per request.

Value

A tibble with additional attributes containing dataset metadata. If the dataset contains a single non-nested geospatial field, it will be returned as an sf object.

The returned object has the following attributes:

id

Asset identifier (four-by-four ID).

name

Asset name.

attribution

Attribution or publisher of the asset.

owner_name

Display name of the asset owner.

provenance

Provenance of the asset (official or community).

description

Textual description of the asset.

created

Date asset was created.

data_last_updated

Date asset data was last updated.

metadata_last_updated

Date asset metadata was last updated.

domain_category

Category label assigned by the domain.

domain_tags

Tags applied by the domain.

domain_metadata

Metadata associated with the asset assigned by the domain.

columns

A data frame with the following columns:

column_name

Names of asset columns.

column_label

Labels of asset columns.

column_datatype

Datatypes of asset columns.

column_description

Descriptions of asset columns.

permalink

Permanent URL where the asset can be accessed.

link

Direct asset link.

license

License associated with the asset.
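
Because the metadata listed above is stored as attributes, it can be retrieved with base R's attr(). The sketch below gathers a few attributes into a named list. asset_metadata() is a hypothetical helper, and the object is an offline stand-in whose name value is assumed for illustration; only the id is taken from the example URL on this page.

```r
# Hypothetical helper (not part of the package): pull selected metadata
# attributes into a named list. Attributes that are absent come back as NULL.
asset_metadata <- function(x, fields = c("id", "name", "license", "permalink")) {
  values <- lapply(fields, function(field) attr(x, field, exact = TRUE))
  names(values) <- fields
  values
}

# Offline stand-in carrying two of the documented attributes.
quakes <- structure(
  data.frame(magnitude = c(2.2, 2.7)),
  id = "3wfw-mdbc",
  name = "USGS Earthquakes 2012-11-08" # assumed value, for illustration only
)

asset_metadata(quakes, c("id", "name", "license"))
```

On a real result, a single attribute such as the column dictionary would be read the same way, for example attr(x, "columns").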

Examples

# \donttest{
soc_read(
  "https://soda.demo.socrata.com/dataset/USGS-Earthquakes-2012-11-08/3wfw-mdbc/"
)
#> # A tibble: 1,935 × 10
#>    source earthquake_id version datetime            magnitude depth
#>    <chr>  <chr>         <chr>   <dttm>                  <dbl> <dbl>
#>  1 uu     09101857      2       2012-09-10 18:57:30       2.2   0.9
#>  2 uu     09081656      2       2012-09-08 16:56:06       2.7   1.2
#>  3 ak     10555601      2       2012-09-10 13:16:13       1.1  11.6
#>  4 nc     71839715      2       2012-09-08 17:12:28       1     2.2
#>  5 uu     09140856      2       2012-09-14 08:56:35       2.3   0.1
#>  6 ak     10503556      2       2012-07-01 02:01:00       2.5   1  
#>  7 hv     60365026      1       2012-06-28 12:06:00       1.7   8.6
#>  8 ci     11131234      0       2012-07-03 08:12:00       1.1  10.2
#>  9 hv     60365151      1       2012-06-29 00:26:00       1.8  11.8
#> 10 ak     10502928      2       2012-06-29 17:12:00       2    96.8
#> # ℹ 1,925 more rows
#> # ℹ 4 more variables: number_of_stations <dbl>, region <chr>,
#> #   location <tibble[,5]>, `:@computed_region_k83t_ady5` <dbl>

soc_read(
  "https://soda.demo.socrata.com/dataset/USGS-Earthquakes-2012-11-08/3wfw-mdbc/",
  soc_query(
    select = "region, avg(magnitude) as avg_magnitude, count(*) as count",
    group_by = "region",
    having = "count >= 5",
    order_by = "avg_magnitude DESC"
  )
)
#> # A tibble: 40 × 3
#>    region                                avg_magnitude count
#>    <chr>                                         <dbl> <dbl>
#>  1 Kuril Islands                                  5.32     5
#>  2 northern Xinjiang, China                       4.87     6
#>  3 south of Java, Indonesia                       4.86     9
#>  4 Tonga                                          4.85     6
#>  5 near the east coast of Honshu, Japan           4.72    10
#>  6 Costa Rica                                     4.66     5
#>  7 Dominican Republic region                      3.25     6
#>  8 Virgin Islands region                          3.11    47
#>  9 Puerto Rico region                             2.92    18
#> 10 Rat Islands, Aleutian Islands, Alaska          2.77    21
#> # ℹ 30 more rows
# }