#crossvalidation

Johann Joets:
https://www.r-bloggers.com/2025/04/setting-up-cross-validation-caret-package-in-r-a-step-by-step-guide/?utm_source=phpList&utm_medium=email&utm_campaign=R-bloggers-daily&utm_content=HTML
#rstats #crossValidation #modeling
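The linked guide covers R's caret package; as a rough analogue, here is a minimal 10-fold setup sketched in Python with scikit-learn (hypothetical data, not taken from the guide):

```python
# Analogous k-fold setup to caret's trainControl(method = "cv", number = 10),
# sketched with scikit-learn on synthetic data.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

cv = KFold(n_splits=10, shuffle=True, random_state=0)  # 10 folds, shuffled once
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=cv,
                         scoring="neg_mean_squared_error")
print(f"10-fold CV MSE: {-scores.mean():.2f} (+/- {scores.std():.2f})")
```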
neuralrow:
New article out, folks. Check it out:
https://medium.com/@neuralrow/cross-validation-1df416898697
#DataScience #MachineLearning #CrossValidation #LinearRegression #DeepLearning #NeuralNetworks
nf-core:
Pipeline release! nf-core/drugresponseeval v1.0.0!
Please see the changelog: https://github.com/nf-core/drugresponseeval/releases/tag/1.0.0
#celllines #crossvalidation #deeplearning #drugresponse #drugresponseprediction #drugs #fairprinciples #generalization #hyperparametertuning #machinelearning #randomizationtests #robustnessassessment #training #nfcore #openscience #nextflow #bioinformatics
Chloé Azencott:
⬆️
6) Thankfully, Wager (2020) https://doi.org/10.1080/01621459.2020.1727235 shows that cross-validation is asymptotically consistent for model selection. So while what we're doing gives us poor estimates of generalization error and bad error bars, at least it's valid for model selection.
#machineLearning #statistics #crossValidation
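A minimal sketch of the practice this result justifies: using cross-validated scores only to rank candidate models, not to certify their absolute error (hypothetical models and data, assuming scikit-learn):

```python
# Model *selection* by CV: keep the candidate with the best mean CV score,
# without trusting the score as an estimate of generalization error.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
cv_means = {name: cross_val_score(model, X, y, cv=5).mean()
            for name, model in candidates.items()}
best = max(cv_means, key=cv_means.get)
print(cv_means, "-> selected:", best)
```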
Chloé Azencott:
⬆️
5) Bates et al. (2023) https://doi.org/10.1080/01621459.2023.2197686 propose a nested cross-validation estimator of generalization error that's unbiased and has an unbiased mean squared error estimator. It's computationally quite intensive. I played a bit with it, and in my high-dimensional setups (large p, small n) I got error bars that did indeed have good coverage of the generalization error, but that also covered most of the [0, 1] interval, which is less helpful.
⬇️
#machineLearning #statistics #crossValidation
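For orientation, the standard nested pattern looks like the sketch below. Note this is generic nested CV, not the specific Bates et al. (2023) estimator, which additionally estimates the MSE of the error estimate itself (hypothetical data, assuming scikit-learn):

```python
# Generic nested CV: hyperparameter tuning in the inner loop, error
# estimation in the outer loop. NOT the Bates et al. (2023) estimator.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=30, random_state=0)

inner = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=5)  # tuning
outer_scores = cross_val_score(inner, X, y, cv=5)                  # evaluation
print(f"nested CV accuracy: {outer_scores.mean():.3f} "
      f"(+/- {outer_scores.std():.3f})")
```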
Chloé Azencott:
⬆️
4) In any case, error bars are wrong, because it's impossible to get an unbiased estimator of the mean squared error of an estimator that's based on a single fold of cross-validation, as shown by Bengio & Grandvalet (2004) https://dl.acm.org/doi/10.5555/1005332.1044695
⬇️
#machineLearning #statistics #crossValidation
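The sketch below computes the naive standard error people typically report from fold scores; per Bengio & Grandvalet, treating the fold estimates as independent is exactly what cannot be fixed, since the folds share training data and are therefore correlated (hypothetical data, assuming scikit-learn):

```python
# The usual (biased) error bar from K-fold scores: it treats the fold
# estimates as i.i.d., which they are not, per Bengio & Grandvalet (2004).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=150, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)

naive_se = scores.std(ddof=1) / np.sqrt(len(scores))  # assumes independence
print(f"CV accuracy: {scores.mean():.3f} +/- {naive_se:.3f} (naive, biased SE)")
```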
Chloé Azencott:
⬆️
3) Cross-validation estimators are better estimators of the *expected test error* (across all possible training sets) than of the *generalization error* of a given model.
This has been known for a while and even appears in The Elements of Statistical Learning, so I should have known about it much earlier. Bates et al. (2023) https://doi.org/10.1080/01621459.2023.2197686 show why this is the case for linear models.
⬇️
#machineLearning #statistics #crossValidation
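A toy simulation of that distinction, under assumed Gaussian linear data: averaged over many training sets, the CV estimate lines up with the expected test error, while its per-dataset correlation with the fitted model's own generalization error is typically weak (purely illustrative, assuming scikit-learn and NumPy):

```python
# Across many training sets, compare the CV estimate with the "true" error
# of each fitted model, measured on one large held-out sample.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
p, n, n_test = 20, 50, 20000
beta = rng.normal(size=p)

def draw(m):
    X = rng.normal(size=(m, p))
    return X, X @ beta + rng.normal(size=m)

X_test, y_test = draw(n_test)  # large sample approximating the distribution
cv_err, gen_err = [], []
for _ in range(200):
    X, y = draw(n)
    cv = -cross_val_score(LinearRegression(), X, y, cv=10,
                          scoring="neg_mean_squared_error").mean()
    model = LinearRegression().fit(X, y)
    true = np.mean((model.predict(X_test) - y_test) ** 2)
    cv_err.append(cv)
    gen_err.append(true)

print("mean CV error:", np.mean(cv_err), "| mean generalization error:", np.mean(gen_err))
print("per-dataset correlation:", np.corrcoef(cv_err, gen_err)[0, 1])
```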
Chloé Azencott:
⬆️
2) (Not a surprise, but worth remembering): cross-validation error bars can be very large when sample sizes are small (due to the \( \frac{1}{\sqrt{n}} \) factor).
This is discussed, for example, regarding microarray studies in Braga-Neto & Dougherty (2004) https://doi.org/10.1093/bioinformatics/btg419 and regarding brain image analysis in @GaelVaroquaux (2018) https://doi.org/10.1016/j.neuroimage.2017.06.061
⬇️
#machineLearning #statistics #crossValidation
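A back-of-envelope illustration of that factor: for an accuracy estimate around 0.7, the binomial standard error \( \sqrt{p(1-p)/n} \) shrinks only as \( \frac{1}{\sqrt{n}} \), giving roughly the interval widths below (plain Python):

```python
# Width of a ~95% interval for an accuracy estimate of 0.7 on n held-out
# samples: small n gives very wide error bars.
import math

for n in (30, 100, 1000, 10000):
    se = math.sqrt(0.7 * 0.3 / n)  # binomial standard error
    print(f"n={n:>5}: roughly +/- {1.96 * se:.3f}")
```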
Chloé Azencott:
⬆️
Reading the discussion of the paper by other statisticians is enlightening as to how the tone of scientific discourse has mercifully changed in 50 years.
Also: "The term 'assessment' is preferred to 'validation' which has a ring of excessive confidence about it."
⬇️
#machineLearning #statistics #crossValidation
Chloé Azencott:
We were discussing cross-validation estimates of model performance recently with colleagues, and I dug a bit into the literature to better understand where we're at.
This is not my topic of expertise, but here are a few tidbits I'd like to share.
1) Cross-validation has been the topic of much discussion for many decades. Stone (1974) https://www.jstor.org/stable/2984809 gives a good overview of what precedes it.
⬇️
#machineLearning #statistics #crossValidation
Nathan Pavlovic:
Enjoying the discussion of cross-validation methods for the use of sensor data in air quality applications at the EPA air sensor QA workshop. It's easy to overestimate how well you are doing with sensor data corrections or fusion applications unless a rigorous independent test approach is used.
#airquality #airpollution #crossvalidation #lowcostsensors @dwestervelt
https://www.epa.gov/amtic/2023-air-sensors-quality-assurance-workshop
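One hypothetical sketch of how that overestimation happens: random k-fold lets records from the same sensor land in both train and test, while grouping folds by sensor forces the kind of independent test the post describes (synthetic data, assuming scikit-learn; not from the workshop):

```python
# Random folds leak per-sensor structure into the test set; grouped folds
# hold out whole sensors and give a more honest score.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)
n_sensors, per_sensor = 10, 50
groups = np.repeat(np.arange(n_sensors), per_sensor)
fingerprint = rng.normal(size=(n_sensors, 5))   # per-sensor feature signature
offset = rng.normal(size=n_sensors)             # per-sensor calibration bias
X = rng.normal(scale=0.3, size=(len(groups), 5)) + fingerprint[groups]
y = X[:, 0] + offset[groups] + rng.normal(scale=0.5, size=len(groups))

model = RandomForestRegressor(n_estimators=100, random_state=0)
r2_random = cross_val_score(model, X, y,
                            cv=KFold(n_splits=5, shuffle=True, random_state=0)).mean()
r2_grouped = cross_val_score(model, X, y, groups=groups,
                             cv=GroupKFold(n_splits=5)).mean()
print(f"random folds R2: {r2_random:.2f} | leave-sensors-out R2: {r2_grouped:.2f}")
```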
Daniele de Rigo:
3/
#Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
https://archive.org/details/meaningofitallth0000feyn_d8d3/page/80/mode/2up?q=%22it+doesn%E2%80%99t+make+any+sense+to+calculate+after+the+event%22&view=theater
Special trending case: #CrossValidation (where data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning mathematical tricks where many dimensions/parameters are tuned using much less data.
Without a deep understanding, black-box tools lead astray.
Tiago F. R. Ribeiro:
Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
#MachineLearning #ModelEvaluation #CrossValidation #HyperparameterOptimization
https://arxiv.org/pdf/1811.12808.pdf
Aki Vehtari:
New paper, "Cross-validatory model selection for Bayesian autoregressions with exogenous regressors", with Alex Cooper, @dan_p_simpson, Lauren Kennedy, and Catherine Forbes.
One FAQ is "Can you use LOO, or cross-validation in general, for time series?" The short answer is "Yes", and I've had a longer answer in the CV-FAQ: https://avehtari.github.io/modelselection/CV-FAQ.html#9_Can_cross-validation_be_used_for_time_series
Now we have a better answer on what kind of cross-validation is good with time series!
#PaperThread #Bayesian #CrossValidation
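The basic forward-chaining discipline behind time-series CV can be sketched as below; the paper's Bayesian leave-future-out machinery is richer than this scikit-learn split pattern, which is shown only for orientation:

```python
# Forward-chaining splits: each fold trains only on the past and tests on
# the future, so no future information leaks into training.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.zeros((20, 1))  # stand-in for 20 time steps of data
for i, (train_idx, test_idx) in enumerate(TimeSeriesSplit(n_splits=4).split(X)):
    print(f"fold {i}: train up to t={train_idx[-1]}, "
          f"test t={test_idx[0]}..{test_idx[-1]}")
```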
Clara Burgard:
We also looked at the influence of the average domain used for the input properties, and we conducted a #CrossValidation to assess how the parameterisations perform on time steps and ice shelves they have not seen during #tuning.
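Schematically, that kind of evaluation amounts to holding out whole groups (here, hypothetical ice shelves) during fitting and scoring only on the unseen groups (synthetic data, assuming scikit-learn; not from the study):

```python
# Leave-one-group-out CV: each fold holds out one entire ice shelf.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
shelves = np.repeat(np.arange(6), 40)      # 6 ice shelves, 40 samples each
X = rng.normal(size=(len(shelves), 3))     # stand-in input properties
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=len(shelves))

scores = cross_val_score(LinearRegression(), X, y,
                         groups=shelves, cv=LeaveOneGroupOut())
print("per-shelf R2:", np.round(scores, 2))
```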