Downloads and parses a dataset from a Socrata open data portal URL, returning it as a tibble or sf
object.
Metadata is also returned as attributes on the returned object.
Usage
soc_read(url, query = soc_query(), alias = "label", page_size = 10000)
Arguments
- url
string; URL of the Socrata dataset.
- query
soc_query(); Query parameters specification.
- alias
string; Use of field alias values. There are three options:
"label": field alias values are assigned as a label attribute for each field.
"replace": field alias values replace existing column names.
"drop": field alias values are ignored.
- page_size
whole number; Maximum number of rows returned per request.
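The alias and page_size arguments can be combined in a small wrapper, sketched below. This is only an illustration: it assumes soc_read() is exported by a package named socratadata (the package name is not stated on this page), and read_quakes() is a hypothetical helper, not part of the package. The network calls are guarded so that only the function definition runs in a non-interactive session.

```r
# Demo dataset URL taken from the Examples section of this page.
quakes_url <- "https://soda.demo.socrata.com/dataset/USGS-Earthquakes-2012-11-08/3wfw-mdbc/"

# Hypothetical wrapper: validates `alias` against the three documented
# options before making any request.
read_quakes <- function(alias = c("label", "replace", "drop"), page_size = 10000) {
  alias <- match.arg(alias)
  # Assumption: soc_read() lives in a package called `socratadata`.
  socratadata::soc_read(quakes_url, alias = alias, page_size = page_size)
}

# Only attempt the download interactively and when the package is installed.
if (interactive() && requireNamespace("socratadata", quietly = TRUE)) {
  # Column names replaced by their aliases, at most 1000 rows per request.
  quakes <- read_quakes(alias = "replace", page_size = 1000)
  print(names(quakes))
}
```

Validating alias with match.arg() means a misspelled option fails immediately with a clear error rather than after a request has been sent.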
Value
A tibble with additional attributes containing dataset metadata.
If the dataset contains a single non-nested geospatial field, it will be returned as an sf
object.
The returned object has the following attributes:
- id
Asset identifier (four-by-four ID).
- name
Asset name.
- attribution
Attribution or publisher of the asset.
- owner_name
Display name of the asset owner.
- provenance
Provenance of asset (official or community).
- description
Textual description of the asset.
- created
Date asset was created.
- data_last_updated
Date asset data was last updated.
- metadata_last_updated
Date asset metadata was last updated.
- domain_category
Category label assigned by the domain.
- domain_tags
Tags applied by the domain.
- domain_metadata
Metadata associated with the asset assigned by the domain.
- columns
A data frame with the following columns:
- column_name
Names of asset columns.
- column_label
Labels of asset columns.
- column_datatype
Datatypes of asset columns.
- column_description
Descriptions of asset columns.
- permalink
Permanent URL where the asset can be accessed.
- link
Direct asset link.
- license
License associated with the asset.
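Since the metadata travels as ordinary R attributes, it can be read back with attr(). The sketch below gathers a few of the attributes listed above into a list. asset_summary() is a hypothetical helper, and the mock object uses invented values as a stand-in for a real soc_read() result so that the example runs without network access.

```r
# Hypothetical helper: collect some of the metadata attributes listed above.
# exact = TRUE avoids partial matching, e.g. "name" silently matching "names"
# on an object that has no "name" attribute.
asset_summary <- function(x) {
  list(
    id = attr(x, "id", exact = TRUE),
    name = attr(x, "name", exact = TRUE),
    license = attr(x, "license", exact = TRUE),
    n_columns = nrow(attr(x, "columns", exact = TRUE)),
    is_spatial = inherits(x, "sf")
  )
}

# Stand-in for a soc_read() result; every value here is invented.
mock <- structure(
  data.frame(magnitude = c(2.2, 2.7)),
  id = "abcd-1234",
  name = "Example asset",
  license = "Public Domain",
  columns = data.frame(
    column_name = "magnitude",
    column_label = "Magnitude",
    column_datatype = "number",
    column_description = "Example description"
  )
)

str(asset_summary(mock))
```

With a real result, the same helper could be called directly on the object returned by soc_read(); the inherits() check distinguishes an sf result from a plain tibble.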
Examples
# \donttest{
soc_read(
  "https://soda.demo.socrata.com/dataset/USGS-Earthquakes-2012-11-08/3wfw-mdbc/"
)
#> # A tibble: 1,935 × 10
#>    source earthquake_id version datetime            magnitude depth
#>    <chr>  <chr>         <chr>   <dttm>                  <dbl> <dbl>
#>  1 uu     09101857      2       2012-09-10 18:57:30       2.2   0.9
#>  2 uu     09081656      2       2012-09-08 16:56:06       2.7   1.2
#>  3 ak     10555601      2       2012-09-10 13:16:13       1.1  11.6
#>  4 nc     71839715      2       2012-09-08 17:12:28       1     2.2
#>  5 uu     09140856      2       2012-09-14 08:56:35       2.3   0.1
#>  6 ak     10503556      2       2012-07-01 02:01:00       2.5   1
#>  7 hv     60365026      1       2012-06-28 12:06:00       1.7   8.6
#>  8 ci     11131234      0       2012-07-03 08:12:00       1.1  10.2
#>  9 hv     60365151      1       2012-06-29 00:26:00       1.8  11.8
#> 10 ak     10502928      2       2012-06-29 17:12:00       2    96.8
#> # ℹ 1,925 more rows
#> # ℹ 4 more variables: number_of_stations <dbl>, region <chr>,
#> #   location <tibble[,5]>, `:@computed_region_k83t_ady5` <dbl>
soc_read(
  "https://soda.demo.socrata.com/dataset/USGS-Earthquakes-2012-11-08/3wfw-mdbc/",
  soc_query(
    select = "region, avg(magnitude) as avg_magnitude, count(*) as count",
    group_by = "region",
    having = "count >= 5",
    order_by = "avg_magnitude DESC"
  )
)
#> # A tibble: 40 × 3
#>    region                                avg_magnitude count
#>    <chr>                                         <dbl> <dbl>
#>  1 Kuril Islands                                  5.32     5
#>  2 northern Xinjiang, China                       4.87     6
#>  3 south of Java, Indonesia                       4.86     9
#>  4 Tonga                                          4.85     6
#>  5 near the east coast of Honshu, Japan           4.72    10
#>  6 Costa Rica                                     4.66     5
#>  7 Dominican Republic region                      3.25     6
#>  8 Virgin Islands region                          3.11    47
#>  9 Puerto Rico region                             2.92    18
#> 10 Rat Islands, Aleutian Islands, Alaska          2.77    21
#> # ℹ 30 more rows
# }