*5.1. Discussion of Alternative System Components and Potential System Enhancements*

5.1.1. Cost Analysis of Cloud Services

The first two versions of the databases created for this application example were hosted in a MySQL database using Amazon Aurora [41] and then Amazon RDS [36]. For the needs of this application example, the costs of Aurora and RDS were a constraint, which is why we instead chose two EC2 instances, hosting MySQL and Grafana, that meet user requirements at a lower average cost. The current virtual machine cyberinfrastructure costs between USD 24 and USD 210/year, depending on how long the EC2 instances are required to be available. However, a database hosted this way may require maintenance, such as software updates and bug fixes, and provides no regional failover. In the event that an AWS region experiences an outage, regional failover allows a copy of the database hosted in a separate region to quickly take over operations. Since in our use case Grafana and the MySQL database need not always be available, the EC2 instances can be shut down and started only on demand, for example when users expect an incoming storm. Turning off the EC2 instances reduces the recurring costs to only the instances' storage units, which cost around USD 12/year for each instance currently using 10 GB of storage space, or around USD 24/year for both EC2 instances. Should the application require seamless regional failover and high database performance, one alternative is to provision two redundant instances running Amazon RDS for MySQL with multi-availability-zone support. The estimated cost of this configuration is USD 623.28/year, considering on-demand instance base costs and 10 GB of SSD storage. Storage calculations and their associated costs for S3 and the database configurations were based on the sensors used in the proof-of-concept system (Table 1).
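The trade-off between the on-demand EC2 setup and the multi-availability-zone RDS alternative can be sketched numerically. The figures below come from the text (January 2023 pricing); the assumption that compute cost scales linearly with uptime between the stated USD 24 and USD 210/year bounds is ours, not AWS's exact billing model.

```python
# Yearly cost comparison sketch for the two hosting options discussed above.
# All dollar figures are taken from the text; the linear-uptime model is an
# illustrative assumption.

EBS_STORAGE_PER_INSTANCE = 12.0  # USD/year for ~10 GB storage, billed even when stopped
EC2_ALWAYS_ON_MAX = 210.0        # USD/year, both EC2 instances always on
RDS_MULTI_AZ = 623.28            # USD/year, redundant RDS for MySQL, multi-AZ, 10 GB SSD

def on_demand_cost(fraction_of_year_running: float) -> float:
    """Approximate yearly cost when the EC2 instances run only part of the year.

    Storage is billed even while instances are stopped; compute cost is
    assumed proportional to uptime.
    """
    storage = 2 * EBS_STORAGE_PER_INSTANCE  # USD 24/year floor with instances off
    compute = (EC2_ALWAYS_ON_MAX - storage) * fraction_of_year_running
    return storage + compute

# Instances kept off except during ~1 month of storm season:
print(round(on_demand_cost(1 / 12), 2))  # 39.5
print(on_demand_cost(1.0))               # 210.0 (always on)
print(round(RDS_MULTI_AZ / on_demand_cost(1.0), 1))  # multi-AZ RDS is ~3x always-on EC2
```

Even with both instances running year-round, the single-server option stays roughly a third of the cost of the failover-capable RDS configuration.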


**Table 1.** Adopted Sensors for the Proof-of-Concept IoT System.

Our calculations for Tables 2–7 were carried out based on the current sensor device configuration of the system (Table 1), pricing rates at development time (January 2023), and a projected 5-year use. The default measurement frequency of the system is 1 measurement every 10 min, averaging 4380 readings per month. To account for temporary measurement frequency increases during storm events, the calculations instead used a figure of 4800 readings per month. One csv file is uploaded every hour to S3 for each registered TTN application, with each write request to S3 costing USD 0.000005. The sensor devices currently in use are one eleven-parameter weather station (DL-ATM41), one pressure/liquid level and temperature sensor (DL-PR26), and two ultrasonic distance/level sensors (DL-MBX), with one TTN application for each sensor model type, resulting in a total of three TTN applications. The average payload size for these four sensors is 343 bytes after parsing and transforming, and the average csv file header size is 822 bytes. Since the weather station contains more measurements per reading than the other two sensor types, its sampling frequency has the most significant impact on the data storage space used. It is important to note that, when data are stored in the MySQL database, the weather station requires almost five times as much storage capacity as either of the other two sensor device types. Since the current system is based on these four sensor devices, the AWS storage configurations may need to be readjusted based on the sensors chosen for the application's system.
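The per-application storage and write-request figures above can be reproduced with a few lines. This is a rough sketch using the stated averages (343-byte payload, 822-byte csv header, 4800 readings/month, one csv upload per hour); the 30-day month is our simplifying assumption.

```python
# Monthly S3 volume and write-request cost per TTN application, using the
# averages stated in the text. Illustrative estimate only.

PAYLOAD_BYTES = 343        # average parsed payload per reading (from text)
HEADER_BYTES = 822         # average csv header per file (from text)
READINGS_PER_MONTH = 4800  # storm-adjusted sampling figure (from text)
FILES_PER_MONTH = 24 * 30  # one csv upload per hour, assuming a 30-day month
WRITE_REQUEST_COST = 0.000005  # USD per S3 write request (from text)

def monthly_bytes(devices_per_app: int) -> int:
    """Bytes added to S3 per month for one TTN application."""
    data = devices_per_app * READINGS_PER_MONTH * PAYLOAD_BYTES
    headers = FILES_PER_MONTH * HEADER_BYTES
    return data + headers

def monthly_write_cost(apps: int) -> float:
    """S3 write-request cost per month across all TTN applications."""
    return apps * FILES_PER_MONTH * WRITE_REQUEST_COST

print(monthly_bytes(1) / 1e6)           # ~2.24 MB/month for a one-device application
print(round(monthly_write_cost(3), 4))  # ~USD 0.0108/month for the 3 TTN applications
```

At these rates, the csv headers account for roughly a quarter of the stored volume for a single-device application, which is why batching readings into fewer, larger files pays off.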

**Table 2.** MySQL database storage (MB) requirements over time per device type (4800 readings/month).


**Table 3.** S3 storage costs calculations for generic sensor devices in the first year (343 Bytes/sensor payload, 828 Bytes/csv header, 4800 sensor payloads/month, and 3 TTN applications).


<sup>1</sup> Only one TTN application was considered for this case.


**Table 4.** MySQL database storage (GB) requirements over time based on number of generic IoT devices (1 kB/reading and 4800 readings/month).

**Table 5.** Database cost on a single EC2 instance (t3.micro), assuming storage requirements for 5 years and generic IoT devices (1 kB/reading and 4800 readings/month).


**Table 6.** Database cost on 2 separate RDS EC2 instances (db.t3.micro) with multi-availability zone deployment and assuming generic IoT devices (1 kB/reading and 4800 readings/month).


**Table 7.** Database cost on Aurora (t3.small) <sup>1</sup> and assuming generic IoT devices (1 kB/reading and 4800 readings/month).


<sup>1</sup> For 50 devices and more, we estimated higher IOPS to handle the average measurement writing load. The cost also includes running 2 EC2 instances by default for regional failover.

The overall yearly system costs can also be lowered by configuring S3 and EC2 instance provisioning and by using built-in AWS cost optimization tools. For S3, if the backup data will not be frequently accessed, it is recommended to change the data's access tier. For this application example, the data are stored under the Standard tier, which costs USD 0.023 per GB. In future iterations of the system, it is recommended to use Intelligent-Tiering or lower-cost tiers: Standard-Infrequent Access (USD 0.0125 per GB), One Zone-Infrequent Access (USD 0.01 per GB), or even the Glacier tiers (USD 0.004 per GB). For the Infrequent Access and Glacier tiers, there is a retrieval fee for every gigabyte retrieved. The Infrequent Access tiers allow millisecond latency when requesting data, whereas with Glacier retrieval can take minutes or hours. Deleting data from non-Standard S3 tiers before their minimum storage duration elapses charges the user for the full minimum duration. The Infrequent Access and Glacier tiers also have a minimum capacity charge per object, so it is recommended to combine individual readings into larger datasets (e.g., monthly readings per sensor) stored as one file in these tiers. To reduce the cost of EC2 instance provisioning (including for RDS and Aurora), AWS allows reserving instances in 1- and 3-year increments instead of using on-demand instances, bringing costs down by up to 38%. The costs calculated in this paper use the current configuration of the system, which uses on-demand EC2 instances, and assume they remain always on.
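The tier comparison above can be sketched as a small calculator. The per-GB rates are the ones quoted in the text, interpreted as monthly (per GB-month) charges, which matches how AWS bills S3 storage; retrieval and minimum-duration fees are omitted, so this is a storage-only lower bound.

```python
# Yearly S3 storage cost per access tier at the per-GB rates quoted above,
# treated as GB-month rates. Retrieval fees, request fees, and minimum
# storage duration charges are intentionally omitted from this sketch.

TIERS = {  # USD per GB-month (rates from the text)
    "Standard": 0.023,
    "Standard-Infrequent Access": 0.0125,
    "One Zone-Infrequent Access": 0.01,
    "Glacier": 0.004,
}

def yearly_storage_cost(gb: float, tier: str) -> float:
    """Storage-only yearly cost for a constant `gb` held in the given tier."""
    return gb * TIERS[tier] * 12

for tier, rate in TIERS.items():
    print(f"{tier}: USD {yearly_storage_cost(10, tier):.2f}/year for 10 GB")
```

For archival backups that are rarely read, the Glacier-class tiers cut storage cost by more than 80% relative to Standard, at the price of slow retrieval and per-GB retrieval fees.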

As shown in Table 2, our current configuration of four sensors reporting on average between six and seven samples per hour (4800 samples/month) results in a MySQL database of less than 140 MB of data at the end of the first year of operation. In Table 3, we show that storing this amount of data in the AWS S3 service would cost USD 0.14 for the first year, and even scaling to 100 sensors at the same average data rate would result in only USD 0.38 in storage costs. This indicates that many small- to medium-scale applications could benefit from this data storage service to back up sensor data at low cost.

In Table 4, we estimate the size of a MySQL database over the first five years, assuming generic sensor samples of 1 kB being uploaded at the rate of 4800 samples/month, as adopted in our example application. The estimated MySQL database size is then used to inform the storage requirement of the virtual machines hosting the respective MySQL databases, as shown in Table 5. Our system with 4 sensors would cost about USD 97.10/year, with each 1 GB increase in storage space resulting in an additional USD 1.20/year. This analysis shows that the uptime of the EC2 servers has the greatest impact on the overall system cost, and turning them off when not required can result in substantial savings. To reduce costs even further, MySQL server disk images can be saved in the S3 data storage service, eliminating EC2 server costs while they are shut down for long periods.
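The database sizing behind Tables 4 and 5 amounts to simple arithmetic, sketched below with the figures from the text (1 kB/reading, 4800 readings/month, USD 1.20 per GB-year of EBS storage); the decimal kB-to-GB conversion is our assumption.

```python
# Back-of-the-envelope MySQL database sizing and incremental storage cost,
# matching the generic-device assumptions of Tables 4-5 in the text.

READING_KB = 1              # generic 1 kB sample (from text)
READINGS_PER_MONTH = 4800   # storm-adjusted rate (from text)
EBS_USD_PER_GB_YEAR = 1.20  # incremental storage cost (from text)

def db_size_gb(devices: int, years: int) -> float:
    """Estimated MySQL database size in GB after `years` of operation.

    Uses decimal units (1 GB = 1e6 kB), an assumption of this rough estimate.
    """
    kb = devices * READINGS_PER_MONTH * 12 * years * READING_KB
    return kb / 1e6

def extra_storage_cost(devices: int, years: int) -> float:
    """Yearly EBS cost attributable to database growth alone."""
    return db_size_gb(devices, years) * EBS_USD_PER_GB_YEAR

print(round(db_size_gb(4, 5), 3))   # 1.152 GB for our 4-device system after 5 years
print(round(db_size_gb(100, 5), 1)) # 28.8 GB for 100 devices after 5 years
```

Even at 100 devices, five years of readings add under USD 35/year in storage, confirming that instance uptime, not disk growth, dominates the bill.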

As a brief exploration of the more robust alternative database services offered by AWS, we assume, in Table 6, two Amazon RDS EC2 instances with multi-availability zone deployment, and, in Table 7, the Amazon Aurora managed database on a more powerful EC2 instance. Both solutions result in total costs over USD 600/year, roughly six times the cost of running a database on a single EC2 MySQL server. Given these substantial cost savings, we therefore recommend our proposed single EC2 MySQL server solution whenever a failover system is not critical to the application.

In Figure 8, we estimate how the cost of S3 data storage varies with sampling rate, total operation duration, and number of sensors. For these calculations, we used a simplified estimation model considering only a fee of USD 0.023 per GB stored and a fee of USD 0.000005 per write request. As in the tables introduced previously, we assume up to three TTN applications and one data request and ingestion operation per hour.
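This simplified model can be sketched as an accumulation over months: each month adds new payload volume, the whole stored volume is billed at the per-GB fee, and each hourly upload incurs a write-request fee. Treating the USD 0.023 per-GB fee as a monthly (GB-month) charge is our assumption about how the model is applied.

```python
# Simplified S3 cost model in the spirit of Figure 8: a per-GB storage fee
# applied monthly on the accumulated volume, plus a per-write-request fee.
# The monthly application of the per-GB fee is an assumption of this sketch.

GB_MONTH_FEE = 0.023       # USD per GB stored, applied each month (assumption)
WRITE_FEE = 0.000005       # USD per write request (from text)
PAYLOAD_BYTES = 343        # average payload per reading (from text)
READINGS_PER_MONTH = 4800  # per sensor (from text)
UPLOADS_PER_MONTH = 24 * 30  # one csv upload per hour, 30-day month

def cumulative_cost(sensors: int, months: int, apps: int = 3) -> float:
    """Total S3 cost after `months`, accumulating stored volume month by month."""
    total = 0.0
    stored_gb = 0.0
    for _ in range(months):
        stored_gb += sensors * READINGS_PER_MONTH * PAYLOAD_BYTES / 1e9
        total += stored_gb * GB_MONTH_FEE            # storage fee on everything held
        total += apps * UPLOADS_PER_MONTH * WRITE_FEE  # hourly csv uploads
    return total

# The comparison noted for Figure 8d: 50 sensors over 10 years vs.
# 200 sensors over 5 years; the two totals land within a few percent.
print(round(cumulative_cost(50, 120), 2))
print(round(cumulative_cost(200, 60), 2))
```

The near-equality arises because cumulative storage grows quadratically with time but only linearly with sensor count, so halving the duration while quadrupling the sensors yields a similar total.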

With the S3 storage cost curves depicted in Figure 8, IoT application developers can estimate how the number of sensors and the data rate parameters influence the total S3 storage costs, as well as how these costs accumulate over time. For instance, in Figure 8d, we can verify that the cost of S3 data storage for an application with 50 sensors over the first ten years is comparable to that of an application with 200 sensors over the first five years.

**Figure 8.** S3 storage costs with varying parameters. Plots (**a**,**c**) evaluate the total cost of S3 data storage at the end of 5 years. Plots (**b**,**d**) assume devices with a sampling rate of 4800 samples per month.
