This paper investigates machine learning for the supervisory control of active and passive thermal storage capacity in buildings. Previous studies show that using active or passive thermal storage, or both, can yield significant peak cooling load reductions and associated savings in electrical demand and operating cost. In this study, model-free learning control is investigated for the operation of electrically driven chilled water systems in heavy-mass commercial buildings. The reinforcement learning controller learns to operate the building and cooling plant from the reinforcement feedback it receives for past control actions (in this study, the monetary cost of each action). The learning agent interacts with its environment by commanding the global zone temperature setpoints and the charging/discharging rate of the thermal energy storage. The controller extracts information about the environment solely from the reinforcement signal; it contains no predictive or system model. Over time, and by exploring the environment, the controller builds a statistical summary of plant operation that is continuously updated as operation continues. The analysis shows that learning control is a feasible methodology for finding a near-optimal control strategy that exploits the active and passive building thermal storage capacity, and that learning performance is affected by the dimensionality of the action and state spaces, the learning rate, and several other factors. Learning control strategies for tasks with large state and action spaces is found to take a long time.
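To make the idea of learning from a cost signal alone more concrete, the following is a minimal sketch of tabular Q-learning on a toy storage problem. It is not the controller studied in the paper: the tariff, the load profile, the four-level storage tank, and all function names (`price`, `load`, `step`, `train`, `greedy_cost`, `baseline_cost`) are assumptions made purely for illustration. The agent never sees the `step` model directly; it only observes the state it lands in and the monetary cost of each action, and it keeps a running table of estimated cost-to-go that plays the role of the "statistical summary" described above.

```python
import random

# Hypothetical toy problem (not from the paper): a storage tank with 4 charge
# levels, 24 hourly steps, and an assumed two-tier electricity tariff.
HOURS, LEVELS = 24, 4
ACTIONS = (-1, 0, 1)  # discharge, idle, charge (one storage level per hour)

def price(hour):
    """Assumed on-peak/off-peak tariff in $/kWh."""
    return 0.20 if 9 <= hour < 18 else 0.05

def load(hour):
    """Assumed chiller electricity needed to meet the cooling load, in kWh."""
    return 2.0 if 9 <= hour < 18 else 0.5

def step(hour, level, action):
    """Environment: return (next_level, monetary cost) for one hour."""
    next_level = min(max(level + action, 0), LEVELS - 1)
    delta = next_level - level  # actual change in stored energy
    electricity = max(load(hour) + delta, 0.0)  # charging adds to the load
    return next_level, price(hour) * electricity

def train(episodes=10000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning that minimizes cost; one episode is one day."""
    rng = random.Random(seed)
    q = {(h, l, a): 0.0
         for h in range(HOURS) for l in range(LEVELS) for a in ACTIONS}
    for _ in range(episodes):
        level = 0
        for hour in range(HOURS):
            # Epsilon-greedy exploration over the action set.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = min(ACTIONS, key=lambda a: q[(hour, level, a)])
            next_level, cost = step(hour, level, action)
            # Estimated cost-to-go from the next state (zero at end of day).
            if hour + 1 < HOURS:
                future = min(q[(hour + 1, next_level, a)] for a in ACTIONS)
            else:
                future = 0.0
            key = (hour, level, action)
            q[key] += alpha * (cost + gamma * future - q[key])
            level = next_level
    return q

def greedy_cost(q):
    """Daily cost when following the learned table without exploration."""
    level, total = 0, 0.0
    for hour in range(HOURS):
        action = min(ACTIONS, key=lambda a: q[(hour, level, a)])
        level, cost = step(hour, level, action)
        total += cost
    return total

def baseline_cost():
    """Daily cost when the storage is never used."""
    return sum(price(h) * load(h) for h in range(HOURS))

if __name__ == "__main__":
    table = train()
    print("no storage:", round(baseline_cost(), 3))
    print("learned policy:", round(greedy_cost(table), 3))
```

Even in this small example the table has 24 x 4 x 3 entries, each of which must be visited repeatedly before its estimate settles. Adding a second action dimension such as a zone temperature setpoint, or a finer state discretization, multiplies the table size, which is consistent with the observation that learning slows as the state and action spaces grow.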
