Caching in Dynamic IoT Networks by Deep Reinforcement Learning


The sensing service of Internet-of-Things (IoT) networks enables IoT sensors to sense environmental information (e.g., temperature and traffic conditions) and send it through the IoT gateway to the users who request that information. The explosive growth of IoT users and sensors injects massive traffic into IoT networks and quickly depletes the batteries of IoT sensors. Caching at the IoT gateway is a promising way to mitigate this problem: popular IoT data are stored at the gateway and sent directly to the users, instead of activating IoT sensors to transmit the data. In our work, we investigate the content placement problem, which determines which data to cache at each time epoch in dynamic IoT networks, with the objective of minimizing the average data transmission delay subject to constraints on cache storage capacity and IoT data freshness. We formulate the problem as an integer linear programming (ILP) problem and then model it as a Markov decision process (MDP). A deep reinforcement learning algorithm is proposed to solve this problem, and its performance is demonstrated via extensive simulations.
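
To make the per-epoch decision concrete, the following is a minimal toy sketch of a gateway cache with a capacity limit and a freshness limit. It is not the formulation or algorithm of this work: the class name `GatewayCache`, the delay values, the oldest-first eviction rule, and the integer age counter are all illustrative assumptions. In an MDP view, the returned delay could serve as a negative reward for the caching action taken at that epoch.

```python
from dataclasses import dataclass, field


@dataclass
class GatewayCache:
    """Toy IoT gateway cache; every parameter here is a hypothetical choice."""
    capacity: int             # maximum number of items stored at the gateway
    max_age: int              # epochs after which a cached item counts as stale
    hit_delay: float = 1.0    # assumed delay when served from the gateway
    miss_delay: float = 5.0   # assumed delay when the sensor must be activated
    ages: dict = field(default_factory=dict)  # item id -> epochs since cached

    def step(self, request: int, cache_it: bool) -> float:
        """Advance one epoch, serve `request`, and return the incurred delay."""
        # Age every cached item and drop those that violate freshness.
        self.ages = {i: a + 1 for i, a in self.ages.items()
                     if a + 1 <= self.max_age}
        if request in self.ages:
            return self.hit_delay
        # Miss: the sensor transmits fresh data, which may then be cached.
        if cache_it and self.capacity > 0:
            if len(self.ages) >= self.capacity:
                # Evict the oldest item to respect the capacity constraint.
                oldest = max(self.ages, key=self.ages.get)
                del self.ages[oldest]
            self.ages[request] = 0
        return self.miss_delay


cache = GatewayCache(capacity=1, max_age=2)
delays = [cache.step(7, True), cache.step(7, False),
          cache.step(7, False), cache.step(7, False)]
average_delay = sum(delays) / len(delays)
```

In this run the first request misses and is cached, the next two are served from the gateway, and the fourth misses again because the cached copy has exceeded `max_age`. A learning agent would choose `cache_it` (and, in a richer model, which item to evict) so as to reduce the long-run average delay.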


