Create (and execute) an SQL search query based on basic search terms and options. This allows you to search the database, without having to understand SQL.
Usage
search_ecotox(
search,
output_fields = list_ecotox_fields("default"),
group_by_results = TRUE,
compute = FALSE,
as_data_frame = TRUE,
...
)
search_ecotox_lazy(
search,
output_fields = list_ecotox_fields("default"),
compute = FALSE,
...
)
search_query_ecotox(search, output_fields = list_ecotox_fields("default"), ...)
Arguments
- search
A named
list
containing the search terms. The names of the elements should refer to the field (i.e. table header) in which the terms are searched. Uselist_ecotox_fields()
to obtain a list of available field names.Each element in that list should contain another list with at least one element named 'terms'. This should contain a
vector
ofcharacter
strings with search terms. Optionally, a second element named 'method' can be provided which should be set to either 'contains
' (default, when missing) or 'exact
'. In the first case the query will match any record in the indicated field that contains the search term. In case of 'exact
' it will only return exact matches. Note that searches are not case sensitive, but are picky with special (accented) characters. While building the local database (see build_ecotox_sqlite) such special characters may be treated differently on different operating systems. For the sake of reproducibility, the user is advised to stick with non-accented alpha-numeric characters.Search terms for a specific field (table header) will be combined with 'or'. Meaning that any record that matches any of the terms are returned. For instance when 'latin_name' 'Daphnia magna' and 'Skeletonema costatum' are searched, results for both species are returned. Search terms across fields (table headers) are combined with 'and', which will narrow the search. For instance if 'chemical_name' 'benzene' is searched in combination with 'latin_name' 'Daphnia magna', only tests where Daphnia magna are exposed to benzene are returned.
When this search behaviour described above is not desirable, the user can either adjust the query manually, or use this function to perform several separate searches and combine the results afterwards.
Beware that some field names are ambiguous and occur in multiple tables (like
cas_number' and
code'). When searching such fields, the search result may not be as expected.- output_fields
A
vector
ofcharacter
strings indicating which field names (table headers) should be included in the output. By default[list_ecotox_fields]("default")
is used. Use[list_ecotox_fields]("all")
to list all available fields.- group_by_results
Ecological test results are generally the most informative element in the ECOTOX database. Therefore, this search function returns a table with unique results in each row.
However, some tables in the database (such as 'chemical_carriers' and 'dose_responses') have a one to many relationship with test results. This means that multiple chemical carriers can be linked to a single test result, similarly, multiple doses can also be linked to a single test result.
By default the search results are grouped by test results. As a result not all doses or chemical carriers may be displayed in the output. Set the
group_by_results
parameter toFALSE
in order to force SQLite to output all data (e.g., all carriers). But beware that test results may be duplicated in those cases.- compute
The ECOTOXr package tries to construct database queries as lazy as possible. Meaning that R moves as much of the heavy lifting as possible to the database. When your search becomes complicated (e.g., when including many output fields), you may run into trouble and hit the SQL parser limits. In those cases you can set this parameter to
TRUE
. Database queries are then computed in the process of joining tables. This is generally slower. Alternatively, you could try to include less output fields in order to simplify the query.- as_data_frame
logical
value indicating whether the result should be converted into adata.frame
(default isTRUE
). When set toFALSE
the data will be returned as atbl_df()
.- ...
Arguments passed to
dbConnectEcotox()
and other functions. You can use this when the database is not located at the default path (get_ecotox_path()
).
Value
In case of search_query_ecotox
, a character
string containing an SQL
query is returned. This query is built based on the provided search terms and options.
In case of search_ecotox
a data.frame
is returned based on the search query built with
search_query_ecotox
. The data.frame
is unmodified as returned by SQLite, meaning that all
fields are returned as character
s (even where the field types are 'date' or 'numeric').
The results are tagged with: a time stamp; the package version used; and the file path of the SQLite database used in the search (when applicable). These tags are added as attributes to the output table or query.
Details
The ECOTOX database is stored locally as an SQLite file, which can be queried with SQL. These functions
allow you to automatically generate an SQL query and send it to the database, without having to understand
SQL. The function search_query_ecotox
generates and returns the SQL query (which can be edited by
hand if desired). You can also directly call search_ecotox
, this will first generate the query,
send it to the database and retrieve the result.
Although the generated query is not optimized for speed, it should be able to process most common searches
within an acceptable time. The time required for retrieving data from a search query depends on the complexity
of the query, the size of the query and the speed of your machine. Most queries should be completed within
seconds (or several minutes at most) on modern machines. If your search require optimisation for speed,
you could try reordering the search fields. You can also edit the query generated with search_query_ecotox
by hand and retrieve it with DBI::dbGetQuery()
.
Note that this package is actively maintained and this function may be revised in future versions.
In order to create reproducible results the user must: always work with an official release from
CRAN and document the package and database version that are used to generate specific results (see also
cite_ecotox()
).
See also
Other search-functions:
websearch_comptox()
,
websearch_ecotox()
Examples
if (FALSE) { # \dontrun{
## let's find the ids of all ecotox tests on species
## where Latin names contain either of 2 specific genus names and
## where they were exposed to the chemical benzene
if (check_ecotox_availability()) {
search <-
list(
latin_name = list(
terms = c("Skeletonema", "Daphnia"),
method = "contains"
),
chemical_name = list(
terms = "benzene",
method = "exact"
)
)
## rows in result each represent a unique test id from the database
result <- search_ecotox(search)
query <- search_query_ecotox(search)
cat(query)
} else {
print("Sorry, you need to use 'download_ecotox_data()' first in order for this to work.")
}
} # }