Data
Whenever possible, I store data as JSON, including within Postgres, since at the time of writing it has become the most common way to share information via websites.
JSON was intended to allow programs written in different languages to communicate effectively.
The minimal principle was critically important. Standards should be simple and complete. The less we have to agree to, the more easily we can interoperate. — How JSON works
schema.org used to use plural identifiers, e.g. performers, to show that the expected value was an array/list. These have been deprecated in favour of singular names, e.g. performer. In practice this means the renderer needs to check whether the value is an array, an object, or a string, something that has tripped me up fairly often when writing Hugo templates for data read from JSON-LD.
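The check itself is language-independent. Below is a minimal sketch of it in R, used only because R is the language of the other examples on this page; it is not Hugo template syntax. It assumes the JSON-LD has already been parsed so that objects arrive as named lists and arrays as unnamed lists, which is what jsonlite::fromJSON(..., simplifyVector = FALSE) should give. The helper name as_items is hypothetical.

```r
# Hypothetical helper: coerce a JSON-LD property value (string, object,
# or array) into a plain list of items so the renderer can always loop.
as_items <- function(value) {
  if (is.null(value)) {
    return(list())                        # property absent
  }
  if (is.character(value) && length(value) == 1) {
    return(list(value))                   # a single string
  }
  if (is.list(value) && !is.null(names(value))) {
    return(list(value))                   # a single object (named list)
  }
  as.list(value)                          # an array (unnamed list or vector)
}

# Three shapes the same "performer" property might take
one_string <- as_items("Alice")
one_object <- as_items(list(`@type` = "Person", name = "Alice"))
an_array   <- as_items(list("Alice", "Bob"))
```

Whatever shape the value had, the caller can now iterate over the result without further type checks.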
Arrays
https://gohugo.io/functions/reflect/isslice/
Objects
https://gohugo.io/functions/reflect/ismap/
Strings
Date
visualisation
R
High-level graphics functions initiate a new plot
curve
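svg("curve_sin.svg", width = 11, pointsize = 12, family = "sans")  # open an SVG graphics device writing to curve_sin.svg
curve(sin, from = 0, to = 2 * pi, n = 101)  # high-level call: starts a new plot of sin over [0, 2*pi]
abline(h = 0)  # low-level call: adds a horizontal line at y = 0 to the existing plot
dev.off()  # close the device so the file is written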
https://www.kaggle.com/learn/intro-to-machine-learning
https://datacarpentry.github.io/R-ecology-lesson/
https://www.stat.cmu.edu/~ryantibs/statcomp/lectures/apply.html
R
- subset
- split
- reorder
- aggregate
- unique
- normalize
- scale
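A minimal sketch of the base R functions among those listed, run on a small made-up data frame. The plants data and its column names are invented for illustration only.

```r
# Hypothetical toy data, invented for this illustration only.
plants <- data.frame(
  site   = c("a", "a", "b", "b", "c", "c"),
  height = c(10, 14, 8, 12, 12, 12),
  weight = c(1.2, 1.8, 1.0, 1.2, 2.0, 2.0)
)

# subset: keep rows matching a condition, and only some columns
tall <- subset(plants, height > 11, select = c(site, height))

# split: break a vector into a list with one element per group
by_site <- split(plants$height, plants$site)

# aggregate: one summary value per group, here the mean height per site
means <- aggregate(height ~ site, data = plants, FUN = mean)

# reorder: re-level a factor by a summary of another variable
# (the default summary is the mean), so sites sort by mean weight
site_ordered <- reorder(factor(plants$site), plants$weight)

# unique: drop duplicated rows (the two "c" rows are identical)
distinct <- unique(plants)

# scale: centre each column on 0 and rescale to unit standard deviation
z <- scale(plants[, c("height", "weight")])
```

Each call returns a new object and leaves plants itself unchanged.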
models
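Most of these take a formula as their first argument and accept a data frame (data), a subset, weights, na.action,… The order is not the same everywhere: in lm these are the first five arguments in that order, whereas glm takes family second, so it is safest to pass everything after the formula by name.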
- lm
- glm
- aov
- loess
- model.matrix
- randomForest (needs to be loaded with library(randomForest))
- lda (needs library(MASS))
- naive_bayes (needs library(naivebayes))
- tree (needs to be loaded with library(tree))
- pcr (needs to be loaded with library(pls))
- plsr
- gam
- gbm
- svm
- survfit
- survdiff
- coxph
- qda
- regsubsets
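As a rough illustration of that shared interface, here is a sketch that passes the same named arguments to lm and glm. The data frame d and its columns are made up for the example.

```r
# Hypothetical data, invented for this illustration only.
d <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, NA, 8),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.9, 30),
  w = c(1, 1, 2, 2, 1, 1, 1, 1)
)

fit <- lm(
  formula   = y ~ x,
  data      = d,
  subset    = y < 20,     # drop the outlying last row
  weights   = w,          # per-row weights, looked up in d
  na.action = na.omit     # drop the row where x is NA
)

# The same arguments, passed by name, also work for glm
# (its default family is gaussian, so the fit matches lm here).
fit_glm <- glm(y ~ x, data = d, subset = y < 20, weights = w,
               na.action = na.omit)

coef(fit)
nobs(fit)   # number of rows actually used in the fit
```

Passing the arguments by name sidesteps the differences in positional order between the model functions.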
formula
~ operator
formula = y ~ x1 + x2 + x3
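A formula is an ordinary R object that can be stored, inspected and modified before any model is fitted. A small sketch, using a made-up data frame d whose column names match the formula above:

```r
f <- y ~ x1 + x2 + x3

class(f)      # "formula"
all.vars(f)   # "y" "x1" "x2" "x3"

# update() rewrites a formula; "." stands for "what was there before"
f2 <- update(f, . ~ . - x3)   # y ~ x1 + x2

# Hypothetical data, invented for this illustration only.
d <- data.frame(
  y  = c(1, 2, 3),
  x1 = c(0, 1, 0),
  x2 = c(5, 6, 7),
  x3 = c(1, 0, 1)
)

# model.matrix() expands the right-hand side into the design matrix
# that the model functions build internally
X <- model.matrix(f, data = d)
colnames(X)   # "(Intercept)" "x1" "x2" "x3"
```

The left of ~ names the response and the right lists the predictors; the intercept column is added automatically.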
metadata
Google feature | Schema.org type |
---|---|
Article | NewsArticle |
Breadcrumb | BreadcrumbList |
Carousel | ItemList |
Dataset | Dataset |
Event | Event |
Local business | LocalBusiness |
Organization | Organization |
Glossary
Formats
Google intro to structured data
I’ve opted to use both JSON-LD and Microdata, and to ignore RDFa, which seems to have fallen out of fashion along with XML.
The reason I’m using Microdata is that I’ve found it an aid in keeping my HTML templates logical and neat, though it does sometimes confuse Google’s scrapers, since two different metadata formats on the same page invite contradictions. For me, Microdata is in lieu of cluttering the HTML code with classes. I get the CSS to reference elements by itemprop rather than class, e.g. with an attribute selector such as [itemprop="name"].