What is min_child_weight in XGBoost?
min_child_weight is not intuitive. But who cares...xgboost still rox.

Great SO question about "Explanation of min_child_weight in xgboost algorithm". Because when you read the docs you expect to hear that it's the number of samples in the leaf. But instead it talks about "instance weight". From the answer by hahdawg we learn the following:
For regression, min_child_weight is the number of samples in a leaf. The default value for this is 1, so for regression tasks you may have an overfit tree that's essentially splitting down to one datapoint. This is because the instance weight is the hessian of the loss, and for squared-error regression the hessian is 1 for every sample, so the sum of instance weights in a leaf is just the number of samples in it.
For non-regression, it's not. For logistic loss, for example, the hessian is p*(1-p), which shrinks toward zero as the node's predictions become confident, so the sum acts as a measure of the purity of the node.
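To make the two cases concrete, here is a small sketch (pure Python, no xgboost required) computing the per-sample hessians XGBoost uses as "instance weights" for its two most common objectives:

```python
import math

# Hessian of squared-error loss L = 0.5*(y - yhat)^2 w.r.t. the raw
# prediction: a constant 1 for every sample, regardless of the margin.
# So sum(hessian) over a leaf == number of samples in that leaf.
def squared_error_hessian(margin):
    return 1.0

# Hessian of logistic loss: p*(1-p) where p = sigmoid(margin).
# It peaks at 0.25 (p = 0.5) and shrinks toward 0 as the node becomes
# pure, i.e. predictions confidently near 0 or 1.
def logistic_hessian(margin):
    p = 1.0 / (1.0 + math.exp(-margin))
    return p * (1.0 - p)

margins = [0.0, 2.0, -4.0]  # toy raw predictions for 3 samples in a leaf

reg_sum = sum(squared_error_hessian(m) for m in margins)
clf_sum = sum(logistic_hessian(m) for m in margins)

print(reg_sum)  # 3.0 -> exactly the sample count
print(clf_sum)  # well below 3: the confident samples contribute little
```

Note how the same three samples give a hessian sum of exactly 3 under regression but far less under logistic loss, which is the whole reason the parameter reads as "number of samples" in one case and "purity" in the other.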
This question asks, âwhy not just use min_child_leafâ, where min_child_leaf would be the number of samples. It seems a lot more intuitive and you wouldnât have to do the mental math for non-regression tasks.
From the XGBoost docs
min_child_weight [default=1]. Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with the sum of instance weight less than min_child_weight, then the building process will give up further partitioning. In linear regression task, this simply corresponds to minimum number of instances needed to be in each node. The larger min_child_weight is, the more conservative the algorithm will be. (source)
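That "mental math for non-regression tasks" can be made explicit. A rough sketch: since the logistic hessian p*(1-p) never exceeds 0.25, even the default min_child_weight=1 implies a minimum leaf size under classification:

```python
import math

def logistic_hessian(margin):
    p = 1.0 / (1.0 + math.exp(-margin))
    return p * (1.0 - p)

# The logistic hessian peaks at p = 0.5 (margin = 0), where it equals
# 0.25. So with min_child_weight=1, a classification leaf needs at
# least ceil(1 / 0.25) = 4 samples in the best case -- and many more
# once the node is pure and each hessian is close to 0.
max_hess = logistic_hessian(0.0)
print(max_hess)                   # 0.25
print(math.ceil(1.0 / max_hess))  # 4 -> best-case minimum leaf size
```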
From the XGBoost code:
```cpp
bool IsValid(GradStats const &left, GradStats const &right) const {
  return left.GetHess() >= param_.min_child_weight &&
         right.GetHess() >= param_.min_child_weight;
}
```
So we can see that if either side's hessian sum falls below min_child_weight, the split is rejected and partitioning stops.
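The same check, rewritten as a minimal Python sketch (is_valid_split is a hypothetical name, not an xgboost API):

```python
# Mirror of the C++ IsValid check: a candidate split survives only if
# the summed hessian on BOTH children clears min_child_weight.
def is_valid_split(left_hess_sum, right_hess_sum, min_child_weight=1.0):
    return (left_hess_sum >= min_child_weight
            and right_hess_sum >= min_child_weight)

# Regression (hessian == 1 per sample): a 5-vs-1 split survives the
# default min_child_weight=1...
print(is_valid_split(5.0, 1.0))                         # True
# ...but is rejected once min_child_weight exceeds the smaller side,
# which is exactly how raising it makes the tree more conservative.
print(is_valid_split(5.0, 1.0, min_child_weight=2.0))   # False
```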
QED.