Skip to contents

Searching the local database

Obviously, searching the local database is only possible after the download and build is ready (see ?download_ecotox_data or ?build_ecotox_sqlite). Once built, it can be queried with ?search_ecotox.

Search the local database for tests of water flea Daphnia magna exposed to benzene

search_ecotox(
  list(
    latin_name    = list(terms = "Daphnia magna", method = "exact"),
    chemical_name = list(terms = "benzene",       method = "exact")
  )
)

Three ways of querying the local database

Let’s have a look at 3 different approaches for retrieving a specific record from the local database, using the unique identifier result_id. The first option is to use the build in search_ecotox function. It uses simple R syntax and allows you to search and collect any field from any table in the database. Furthermore, all requested output fields are automatically joined to the result without the end-user needing to know much about the database structure.

Using the prefab function search_ecotox packaged by ECOTOXr

search_ecotox(
  list(
    result_id = list(terms = "401386", method = "exact")
  ),
  as_data_frame = F
)

The approach shown above is provided for the user’s convenience. However, it is a relatively blunt instrument. It’s a hammer, which is fine when you have nails, but less effective when you have screws. In most cases it is more efficient to create a custom query for you specific needs. In those cases you can use two different strategies, involving either tidyverse verbs or the simple query language (SQL).

If you like to use dplyr verbs, you are in luck. SQLite database can be approached using dplyr verbs. This approach will only return information from the results table. The end-user will have to join other information (like test species and test substance) manually. This does require knowledge of the database structure (see vignette("ecotox-schema")).

Using dplyr verbs

con <- dbConnectEcotox()
dplyr::tbl(con, "results") |>
  dplyr::filter(result_id == "401386") |>
  dplyr::collect()

If you prefer working using SQL directly, that is fine too. The RSQLite package allows you to get queries using SQL statements. The result is identical to that of the previous approach. Here too the end-user needs knowledge of the database structure in order to join additional data. For more details see vignette("ecotox-schema"), it shows the database structure and provides clues on how to construct custom queries with dplyr.

Using SQL syntax

dbGetQuery(con, "SELECT * FROM results WHERE result_id='401386'") |>
  dplyr::as_tibble()

All three approaches shown above generate identical results. Although, the first has additional information joined automatically.

Searching the online database

You can also use the package to search using the online webform:

search_fields <-
  list_ecotox_web_fields(
    txAdvancedSpecEntries     = "daphnia magna",
    RBSPECSEARCHTYPE          = "EXACT",
    txAdvancedChemicalEntries = "benzene",
    RBCHEMSEARCHTYPE          = "EXACT")

search_results <- websearch_ecotox(search_fields, verify_ssl = FALSE)

search_results$`Aquatic-Export`
#> # A tibble: 56 × 87
#>    `CAS Number` `Chemical Name` `Chemical Grade` `Chemical Analysis`
#>           <dbl> <chr>           <chr>            <chr>              
#>  1        71432 Benzene         NA               NA                 
#>  2        71432 Benzene         NA               NA                 
#>  3        71432 Benzene         NA               Unmeasured         
#>  4        71432 Benzene         NA               Unmeasured         
#>  5        71432 Benzene         NA               Unmeasured         
#>  6        71432 Benzene         NA               Measured           
#>  7        71432 Benzene         NA               Unmeasured         
#>  8        71432 Benzene         NA               Unmeasured         
#>  9        71432 Benzene         NA               NA                 
#> 10        71432 Benzene         NA               Unmeasured         
#> # ℹ 46 more rows
#> # ℹ 83 more variables: `Chemical Purity Mean Op` <chr>,
#> #   `Chemical Purity Mean(%)` <dbl>, `Chemical Purity Min Op` <lgl>,
#> #   `Chemical Purity Min(%)` <lgl>, `Chemical Purity Max Op` <lgl>,
#> #   `Chemical Purity Max(%)` <lgl>, `Species Scientific Name` <chr>,
#> #   `Species Common Name` <chr>, `Species Group` <chr>,
#> #   `Organism Lifestage` <chr>, `Organism Age Mean Op` <chr>, …

When aiming for reproducibility it is better to query the local database, as the online tool may not be able to search and export all fields; data may change and is beyond your control; and it is processed on the server which forms a black box. See also vignette("reproducibility").