More generally, when evaluating any data mining algorithm, if the test set is a subset of the training data, the results will be optimistic, and often overly so. So that doesn’t seem like a great idea.

Original Post: Training Sets, Test Sets, and 10-fold Cross-validation
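As a rough illustration of that optimism, the sketch below scores a 1-nearest-neighbor classifier twice: once on the data it was built from and once on fresh data. The synthetic data, the 30% label noise, and all function names are assumptions made for this example, not taken from the original post.

```python
import random

def nearest_neighbor_predict(train, x):
    """Return the label of the training point closest to x (1-NN)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def accuracy(train, data):
    """Fraction of `data` that a 1-NN model built on `train` labels correctly."""
    hits = sum(nearest_neighbor_predict(train, x) == y for x, y in data)
    return hits / len(data)

def make_data(n, rng, noise=0.3):
    """Synthetic 1-D data: the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data

rng = random.Random(0)
train = make_data(200, rng)
held_out = make_data(200, rng)

train_accuracy = accuracy(train, train)        # test set taken from the training data
held_out_accuracy = accuracy(train, held_out)  # test set the model has never seen

print(f"accuracy on training data: {train_accuracy:.2f}")
print(f"accuracy on held-out data: {held_out_accuracy:.2f}")
```

A 1-NN model memorizes its training points, so each one is its own nearest neighbor and the training-set score says nothing about how the model handles new data.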

# Cross-validation

## How (and Why) to Create a Good Validation Set

The definitions of training, validation, and test sets can be fairly nuanced, and the terms are sometimes used inconsistently. In the deep learning community, “test-time inference” is often used to refer to evaluating on data in production, which is not the technical definition of a test set.

Original Post: How (and Why) to Create a Good Validation Set
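One common reading of the three terms is a three-way split: fit on the training set, tune on the validation set, and touch the test set only once at the end. The sketch below assumes a plain random shuffle and a 60/20/20 split; the function name and fractions are illustrative choices. A random shuffle is not appropriate for every dataset (time-ordered data, for instance, is usually split by date instead).

```python
import random

def train_validation_test_split(items, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle `items` and cut them into three disjoint lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    train = shuffled[n_test + n_validation:]
    return train, validation, test

train, validation, test = train_validation_test_split(range(100))
print(len(train), len(validation), len(test))
```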

## Top KDnuggets tweets, Sep 06-12: Visualizing Cross-validation Code; Intro to #Blockchain and #BigData

Also: WTF #Python – A collection of interesting and tricky Python examples; Thoughts after taking @AndrewYNg #Deeplearning #ai course; Another #Keras Tutorial For #NeuralNetwork Beginners.

Original Post: Top KDnuggets tweets, Sep 06-12: Visualizing Cross-validation Code; Intro to #Blockchain and #BigData

## Visualizing Cross-validation Code

Cross-validation with the K-Fold strategy helps you estimate how well your model predicts on unseen data. What is K-Fold, you ask? Check out this post for a visualized explanation.

Original Post: Visualizing Cross-validation Code
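For readers without access to the original visualization, here is a minimal text-only sketch of the idea. It assumes contiguous, unshuffled folds; the helper names `k_fold_indices` and `render_fold` are made up for this example. Each row is one fold, with `V` marking the samples held out for validation and `.` marking the samples used for training.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # The first n_samples % k folds get one extra sample so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size

def render_fold(n_samples, validation):
    """Draw one fold as a row of characters: 'V' for validation, '.' for training."""
    held_out = set(validation)
    return "".join("V" if i in held_out else "." for i in range(n_samples))

for fold, (train, validation) in enumerate(k_fold_indices(10, 5), start=1):
    print(f"fold {fold}: {render_fold(10, validation)}")
```

Reading down the columns, every sample is marked `V` in exactly one row, which is the defining property of K-Fold.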

## Understanding overfitting: an inaccurate meme in supervised learning

“Applying cross-validation prevents overfitting” is a popular meme, but it is not actually true; it is more of an urban legend. We examine what is true and how overfitting differs from overtraining.

Original Post: Understanding overfitting: an inaccurate meme in supervised learning

## Making Predictive Models Robust: Holdout vs Cross-Validation

The validation step helps you find the best parameters for your predictive model and guard against overfitting. We examine the pros and cons of two popular validation strategies: the hold-out strategy and k-fold.

Original Post: Making Predictive Models Robust: Holdout vs Cross-Validation
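To make the contrast concrete, the sketch below scores the same trivial model both ways. Everything here is an assumption for illustration: the "model" is just the training mean, the data are synthetic Gaussian draws, and the function names are invented. Hold-out produces one score from one split, while k-fold produces k scores whose mean uses every sample for validation once.

```python
import random
from statistics import mean

def mse_of_mean_model(train_y, validation_y):
    """Fit a constant model (the training mean) and return its mean squared error."""
    prediction = mean(train_y)
    return mean((y - prediction) ** 2 for y in validation_y)

def hold_out_score(y, validation_fraction=0.2):
    """One split: the last `validation_fraction` of the data is held out once."""
    cut = int(len(y) * (1 - validation_fraction))
    return mse_of_mean_model(y[:cut], y[cut:])

def k_fold_scores(y, k=5):
    """k splits: every sample is used for validation exactly once."""
    scores = []
    for fold in range(k):
        validation = y[fold::k]  # every k-th sample, offset by the fold number
        train = [v for i, v in enumerate(y) if i % k != fold]
        scores.append(mse_of_mean_model(train, validation))
    return scores

rng = random.Random(0)
y = [rng.gauss(10.0, 2.0) for _ in range(100)]  # synthetic, already in random order

scores = k_fold_scores(y)
print(f"hold-out MSE: {hold_out_score(y):.3f}")
print(f"5-fold MSE:   {mean(scores):.3f} (per fold: {[round(s, 3) for s in scores]})")
```

The spread of the per-fold scores gives a sense of how much a single hold-out estimate could vary with the choice of split; the price is fitting the model k times instead of once.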

## Understanding the Bias-Variance Tradeoff: An Overview

Tags: Bias, Cross-validation, Model Performance, Variance

A model’s ability to minimize bias and its ability to minimize variance are often thought of as two opposing ends of a spectrum. Understanding these two types of error is critical to diagnosing model results. By Matthew Mayo, KDnuggets. A few years ago, Scott Fortmann-Roe wrote a great…

Original Post: Understanding the Bias-Variance Tradeoff: An Overview
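The two error types can be estimated by simulation. The sketch below is an illustrative setup, not the one from the original post: it assumes a sine-shaped target, Gaussian noise, and two deliberately extreme models (a constant predictor and a 1-nearest-neighbor predictor). Each model is refit on many noisy training sets, and its predictions at a single point are summarized as squared bias and variance.

```python
import math
import random
from statistics import mean, pvariance

def true_function(x):
    return math.sin(2 * math.pi * x)

GRID = [i / 19 for i in range(20)]  # fixed inputs; only the noise changes per trial
X0 = 0.25                           # point at which the predictions are compared
NOISE_SD = 0.3

def constant_model(xs, ys, x):
    """Very simple model: ignore x and predict the mean target."""
    return mean(ys)

def one_nn_model(xs, ys, x):
    """Very flexible model: copy the target of the nearest training input."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[nearest]

def bias_squared_and_variance(model, trials=500, seed=0):
    """Refit `model` on many noisy training sets and summarize its predictions at X0."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(trials):
        ys = [true_function(x) + rng.gauss(0.0, NOISE_SD) for x in GRID]
        predictions.append(model(GRID, ys, X0))
    bias_squared = (mean(predictions) - true_function(X0)) ** 2
    return bias_squared, pvariance(predictions)

for name, model in [("constant", constant_model), ("1-NN", one_nn_model)]:
    bias_squared, variance = bias_squared_and_variance(model)
    print(f"{name:>8}: bias^2 = {bias_squared:.4f}, variance = {variance:.4f}")
```

In this setup the constant model barely changes between training sets but is systematically wrong at `X0`, while the 1-NN model is right on average but inherits the full noise of a single training point.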

## How to Compute the Statistical Significance of Two Classifiers Performance Difference

Tags: Classifier, Cross-validation, Model Performance

To determine whether a result is statistically significant, a researcher calculates a p-value: the probability of observing an effect at least as large as the one measured, given that the null hypothesis is true. Here we demonstrate how to use it to assess the difference in performance between two models. By Theophano…

Original Post: How to Compute the Statistical Significance of Two Classifiers Performance Difference
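The excerpt does not say which test the original post uses, so the sketch below swaps in one simple option: a paired sign-flip permutation test on per-fold scores. Under the null hypothesis that the two models perform equally, the sign of each per-fold difference is arbitrary, so the p-value is the share of all sign assignments whose mean difference is at least as extreme as the observed one. The function name and the accuracy figures are made up for illustration.

```python
from itertools import product

def sign_flip_p_value(scores_a, scores_b):
    """Two-sided p-value for the mean paired difference, by exhaustive sign flipping."""
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(differences)
    observed = abs(sum(differences) / n)
    at_least_as_extreme = 0
    for signs in product((1, -1), repeat=n):  # all 2**n ways to swap the pair labels
        flipped = abs(sum(s * d for s, d in zip(signs, differences)) / n)
        if flipped >= observed - 1e-12:       # small tolerance for floating-point ties
            at_least_as_extreme += 1
    return at_least_as_extreme / 2 ** n

# Made-up per-fold accuracies for two hypothetical classifiers on the same 10 folds.
model_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
model_b = [0.78, 0.77, 0.80, 0.79, 0.80, 0.77, 0.79, 0.81, 0.78, 0.79]

print(f"p-value: {sign_flip_p_value(model_a, model_b):.4f}")
```

Exhaustive enumeration is only practical for a small number of folds, and scores from overlapping cross-validation training sets are not fully independent, so the result is best read as a rough guide.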