Package: arete 0.1

arete: Automated REtrieval from TExt

A Python-based pipeline for extracting species occurrence data with large language models (LLMs). Includes validation tools designed to handle model hallucinations, for a scientific, rigorous use of LLMs. Currently supports GPT, with support for more models planned, including local and non-proprietary ones. For more details on the methodology used, please consult the references listed under each function, such as Kent, A. et al. (1995) <doi:10.1002/asi.5090060209>, van Rijsbergen, C.J. (1979, ISBN:978-0408709293), Levenshtein, V.I. (1966) <https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf> and Klaus Krippendorff (2011) <https://repository.upenn.edu/handle/20.500.14332/2089>.

Authors: Vasco V. Branco [cre, aut], Vaughn Shirey [ctb], Thomas Merrien [ctb], Pedro Cardoso [aut]

arete_0.1.tar.gz
arete_0.1.zip (r-4.7), arete_0.1.zip (r-4.6), arete_0.1.zip (r-4.5)
arete_0.1.tgz (r-4.6-any), arete_0.1.tgz (r-4.5-any)
arete_0.1.tar.gz (r-4.7-any), arete_0.1.tar.gz (r-4.6-any)
arete_0.1.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
arete/json (API)

# Install 'arete' in R:
install.packages('arete', repos = c('https://vascobranco.r-universe.dev', 'https://cloud.r-project.org'))
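
The description above cites precision and recall, the F-measure and Levenshtein distance as part of the validation methodology. As a rough illustration of what those metrics measure, here is a minimal sketch in base R only. It does not call any arete function, and the species lists and variable names are hypothetical, invented for this example.

```r
# Hypothetical data: names recorded by a human annotator (reference)
# and names returned by an LLM extraction (extracted). Not produced by arete.
reference <- c("Lycosa tarantula", "Argiope bruennichi", "Araneus diadematus")
extracted <- c("Lycosa tarantula", "Argiope bruenichi", "Pardosa amentata")

# Exact-match precision, recall and F1 score.
true_pos  <- sum(extracted %in% reference)
precision <- true_pos / length(extracted)
recall    <- true_pos / length(reference)
f1 <- if (precision + recall == 0) 0 else
  2 * precision * recall / (precision + recall)

# Levenshtein (edit) distance from each extracted name to its closest
# reference name. utils::adist() returns a matrix of edit distances with
# one row per extracted name and one column per reference name.
edit_dist <- utils::adist(extracted, reference)
nearest   <- apply(edit_dist, 1, min)

print(c(precision = precision, recall = recall, f1 = f1))
print(setNames(nearest, extracted))
```

A small edit distance on a name that fails the exact match (here the misspelled "Argiope bruenichi") points to a likely typo rather than a hallucinated record, which is one reason to report both kinds of metric side by side.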

Bug tracker: https://github.com/vascobranco/arete/issues

ecology, large-language-models, wildlife-conservation

3.70 score, 1 star, 3 scripts, 217 downloads, 19 exports, 195 dependencies

Last updated from: cc738ca101. Checks: 7 NOTE, 2 OK. Indexed: yes.

Target                 Result  Time
linux-devel-x86_64     NOTE    422
source / vignettes     OK      321
linux-release-x86_64   NOTE    394
macos-release-arm64    NOTE    677
macos-oldrel-arm64     NOTE    458
windows-devel          NOTE    766
windows-release        NOTE    820
windows-oldrel         NOTE    800
wasm-release           OK      199

Exports: arete_data, arete_setup, aux_string_to_coords, check_lang, compare_IUCN, create_training_data, file_comparison, gazetteer, get_geodata, install_OCR_packages, install_python_packages, labels, labels_unique, performance_report, process_document, process_species_names, string_to_coords, webanno_open, webanno_summary

Dependencies: abind, ape, askpass, base64enc, BAT, BH, biomod2, bit, bit64, boot, bslib, cachem, caret, class, classInt, cld2, cli, clock, cluster, clusterGeneration, coda, codetools, combinat, cpp11, crayon, curl, data.table, DBI, DEoptim, diagram, digest, dismo, doParallel, dplyr, e1071, evaluate, expm, farver, fastcluster, fastmap, fastmatch, fedmatch, FNN, fontawesome, forcats, foreach, fs, future, future.apply, gargle, gbm, gdistance, gecko, generics, geometry, geosphere, ggplot2, globals, glue, googledrive, gower, gtable, hardhat, here, highr, hitandrun, hms, htmltools, httr, hypervolume, igraph, ipred, irr, isoband, iterators, jquerylib, jsonlite, kableExtra, kernlab, KernSmooth, knitr, ks, labeling, lattice, lava, lifecycle, linprog, listenv, lpSolve, lubridate, magic, magrittr, maps, MASS, Matrix, mclust, memoise, mgcv, mime, mnormt, ModelMetrics, multicool, mvtnorm, nlme, nls2, nnet, numDeriv, openssl, optimParallel, palmerpenguins, parallelly, pbapply, pdftools, pdist, permute, phangorn, phytools, pillar, pkgconfig, PlotTools, plyr, png, pracma, predicts, PresenceAbsence, prettyunits, pROC, prodlim, progress, progressr, proto, proxy, purrr, qpdf, quadprog, R6, rappdirs, raster, rbibutils, rcdd, RColorBrewer, Rcpp, RcppArmadillo, RcppProgress, RcppTOML, Rdpack, recipes, red, reshape, reshape2, reticulate, rlang, rmarkdown, rpart, rprojroot, rstudioapi, s2, S7, sass, scales, scatterplot3d, sf, shape, SnowballC, sp, sparsevctrs, SQUAREM, stringdist, stringi, stringr, survival, svglite, sys, systemfonts, terra, textshaping, tibble, tidyr, tidyselect, timechange, timeDate, tinytex, TreeTools, tzdb, units, utf8, uuid, vctrs, vegan, viridisLite, withr, wk, xfun, xml2, yaml

Package workflow

Rendered from request_example.Rmd using knitr::rmarkdown on May 06 2026.

Last update: 2025-11-06
Started: 2025-11-06