This came to light only recently: the Latin hypercube version of the B1 metric in environment space is probably not trustworthy as currently implemented. Because the calculation standardizes the distribution and then takes logs, the result depends on sample size, and the metric fails to converge. For an illustration, here's B2 as a function of sample size:
That's behaving as you'd like it to - seems to be converging on a relatively stable value, not changing much with additional sampling (note the scale of the Y axis).
Now look at B1:
There's an obvious trend here with increasing sample size, and the scale of the Y axis is such that those differences could be quite significant.
At some future date we may figure out how to adjust for this, but for now I'd say just avoid using B1 in environment space altogether.
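If you want to see the effect for yourself, here's a minimal sketch in plain Python. It makes several assumptions that are mine and not taken from the actual implementation: B1 is treated as a Levins-style entropy, -sum(p ln p), standardized by dividing by ln(n); B2 is treated as the inverse Simpson concentration rescaled to the 0-1 range; and simple uniform random draws stand in for the Latin hypercube sample. The function names and the toy suitability surface (`u ** 4`) are hypothetical.

```python
import math
import random


def breadth_b1(suitability):
    """Assumed B1: entropy of the standardized scores, divided by ln(n)."""
    total = sum(suitability)
    n = len(suitability)
    # Standardize so the scores sum to 1; drop zeros, since 0 * ln(0) counts as 0.
    p = [s / total for s in suitability if s > 0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    return entropy / math.log(n)


def breadth_b2(suitability):
    """Assumed B2: inverse Simpson concentration, rescaled to lie in [0, 1]."""
    total = sum(suitability)
    n = len(suitability)
    inverse_concentration = 1.0 / sum((s / total) ** 2 for s in suitability)
    return (inverse_concentration - 1.0) / (n - 1)


def sample_suitability(n, rng):
    """Hypothetical skewed suitability scores at n random environment points."""
    return [rng.random() ** 4 for _ in range(n)]


rng = random.Random(1)  # fixed seed so reruns give the same numbers
results = {}
for n in (2_000, 20_000, 200_000):
    scores = sample_suitability(n, rng)
    results[n] = (breadth_b1(scores), breadth_b2(scores))
    print(f"n={n:>7}  B1={results[n][0]:.4f}  B2={results[n][1]:.4f}")
```

Under these assumptions the drift in B1 isn't sampling noise. After standardizing, each p is roughly s / (n * mean(s)), so the entropy works out to ln(n) plus a constant that depends only on the suitability distribution. Dividing by ln(n) gives B1 ≈ 1 + C / ln(n), which keeps creeping toward 1 as n grows. The same algebra for B2 gives a ratio that settles on mean(s)^2 / mean(s^2), which is 0.36 for this toy surface, so it stabilizes once the sample is reasonably large.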