This additional backtest will hopefully help someone in the future. I ran data.current(), data.history(frequency='1m'), data.history(frequency='1d') and pipeline before and after trading starts.
Case 1: Before trading starts
Calling data.current(), data.history(frequency='1m'), data.history(frequency='1d'), and pipeline_output in before_trading_start() returns price/close as expected (e.g. close=close, price=price). Volume for '1m' equals the data.current() volume and is the last tick's volume. Volume for '1d' is the sum of the previous trading day's volume. For me this is intuitive.
Case 2: At initial market open - intuitive
data.current() = data.history(frequency='1m')[-1], and it is the current price (forward filled) and the previous close. All other data.history bars are previous ticks.
Case 3: Market open plus 30 minutes - still intuitive
data.current() = data.history(frequency='1m')[-1] and is the previous tick's OHLCV and the forward-filled price.
Case 4: At market open - the case to beware of (daily history)
When you call data.history(frequency='1d') at market open, you get the previous bar's 1m close value along with the forward-filled price. So at the very moment the market opens, data.history(frequency='1m') = data.history(frequency='1d') = data.current().
As time progresses, data.history(frequency='1d') diverges: high marks the highest bar value prior to the current bar, low marks the lowest bar value, and volume is aggregated up to that point.
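To make the divergence concrete, here is a minimal sketch of how an in-progress daily bar could be built from the minute bars seen so far. This is not Quantopian code: partial_daily_bar and the sample numbers are hypothetical, and open is left out on purpose because only high, low, close, and volume are being illustrated here.

```python
# Hypothetical minute bars for one symbol on the current trading day,
# oldest first. The numbers are made up for illustration.
minute_bars = [
    {"open": 100.0, "high": 100.5, "low": 99.8, "close": 100.2, "volume": 1200},
    {"open": 100.2, "high": 101.0, "low": 100.1, "close": 100.9, "volume": 800},
    {"open": 100.9, "high": 100.9, "low": 100.3, "close": 100.4, "volume": 500},
]


def partial_daily_bar(bars):
    """Aggregate the minute bars seen so far into an in-progress daily bar."""
    if not bars:
        raise ValueError("need at least one minute bar")
    return {
        "high": max(b["high"] for b in bars),      # highest value so far today
        "low": min(b["low"] for b in bars),        # lowest value so far today
        "close": bars[-1]["close"],                # most recent minute's close
        "volume": sum(b["volume"] for b in bars),  # volume aggregated so far
    }


# Show how the partial daily bar changes as each new minute arrives.
for n in range(1, len(minute_bars) + 1):
    print(n, partial_daily_bar(minute_bars[:n]))
```

After the first minute the daily bar and the minute bar agree, which matches what I saw at the open. From the second minute on, the daily values drift away from the latest minute bar.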
Q's help explicitly states that "price" is always forward-filled. The other fields ("open", "high", "low", "close", "volume") are never forward-filled.
For me this is a type of forward-filling, though I do agree it can be helpful. You can probably glean this from the docs with a careful reading of how they handle the examples for volume. But since most of the examples use the forward-filled price, it is not obvious that at a daily frequency tick data is returned for close/open, while high/low get the max/min of all previous bars up to that point.
Perhaps a better statement in help would be:
data.history() with frequency='1d' doesn't honor the never forward-filled contract for OHLCV. Instead it returns the close/open of the previous minute bar, the max/min of all previous bars as high/low, and aggregated volume for all bars within the current trading day.
I don't think it is correct to say OHLCV is NOT forward-filled when frequency='1d'.
Case 5: At market open - pipeline call - also one to beware of
Now we come to the reason I thought this was a bug. If you look at the description for the US Equity Pricing database it says it contains minute data for OHLCV (not price). But still, minute level data.
So, what do you think happens if you call pipeline data for USEquityPricing during the trading day? If you guessed it would behave like data.history() you're wrong (or in my case I thought data.history should behave like pipeline data).
In fact, when you call pipeline at market open you get the previous day's close and the previous day's total volume, not tick data (i.e. this pipeline call exactly matches the before_trading_start pipeline call). For me this is the definition of not forward filled.
Why is this important?
This introduced a very subtle issue for me where two dataframes with exactly the same time-stamped index had different values for "close". With everything trading on "daily" frequency this was very hard to track down.
USEquityPricing always returns end of previous day data no matter what time it is called from pipeline (at least as far as I can tell).
data.history(frequency='1d') is previous tick's value for O/C for the current trading day, the max/min of all previous bars for H/L within the current trading day, and the aggregated value for Volume for the current trading day.
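If you want to catch this kind of mismatch early, a small check like the one below might help. It is only a sketch: mismatched_fields is a hypothetical helper, and the two rows hold made-up values standing in for what pipeline and data.history(frequency='1d') return for the same symbol under the same timestamp.

```python
# Hypothetical values for one symbol, both stamped with the same date.
pipeline_row = {"close": 150.00, "volume": 30000000}  # previous day's completed bar
history_row = {"close": 151.25, "volume": 4200000}    # partial bar, 30 minutes in


def mismatched_fields(a, b, tolerance=1e-9):
    """Return the sorted field names whose values differ between two rows."""
    shared = set(a) & set(b)  # only compare fields present in both rows
    return sorted(f for f in shared if abs(a[f] - b[f]) > tolerance)


print(mismatched_fields(pipeline_row, history_row))
```

Running a check like this on the two sources before combining them would have pointed me at "close" much sooner.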
This backtest contains the log for Apple, called 30 minutes after trading starts. It is also helpful to remove minutes=30 and compare the results.