@crunklord420 Having worked with all of the products you've mentioned, what do you wanna know?
As for Prometheus/Grafana, it works for simple monitoring stuff. If you wish to add complexity you can look at Prometheus's Alertmanager.
@crunklord420 I would say that's its normal use case, to be honest. A lot of software has Prometheus support, and when it comes to Cloud Native ( 🤢 ) software they are the go-to self-hosted solution.
As for me, I've used them to monitor 100+ boxes (using node_exporter) and the devs used it to monitor their shitty single-instance applications. It's fairly rock solid: feed it enough disk space and memory and you can go a very long time without having to look at it.
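For a fleet that size you don't want to hand-type every target. A rough sketch, assuming you use Prometheus's file-based service discovery (`file_sd_configs`) and node_exporter on its default port 9100. The hostnames, the `env` label and the helper name `file_sd_targets` are made up for illustration:

```python
import json


def file_sd_targets(hosts, port=9100, labels=None):
    """Build one target group in the JSON shape that file_sd_configs reads.

    hosts:  iterable of hostnames (hypothetical names in the demo below)
    port:   node_exporter's default listen port is 9100
    labels: optional extra labels attached to every target in the group
    """
    group = {"targets": [f"{host}:{port}" for host in hosts]}
    if labels:
        group["labels"] = dict(labels)
    return [group]


if __name__ == "__main__":
    # Hypothetical hostnames, only here to show the output shape.
    hosts = [f"box{i:03d}.example.internal" for i in range(1, 4)]
    print(json.dumps(file_sd_targets(hosts, labels={"env": "prod"}), indent=2))
```

Write the output to a file and point a `file_sd_configs` entry in your scrape config at it. Prometheus picks up changes to that file without a restart.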
@crunklord420 You tell Prometheus how long you wish to retain the information you scraped from the exporters (the `--storage.tsdb.retention.time` flag). By default this is 15 days and it's fine.
You have to keep in mind that Prometheus is built to give you an accurate picture of what is happening here and now. So it's excellent with frequent updates and storing loads of data in its TSDB. For the long term it isn't recommended. ( I mean, I did it at work, had like a year or so of retention, 1.5TB on a simple t2.medium box. It worked, you could go back really far and it had great detail. But damn, you could hear the instance just generating a fire in AWS's datacenter. )
Ultimately it depends on how much you wish to store and for how long, which in turn determines the size. As for pruning, this happens automagically, but it applies to all the time series, not specific ones.
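If you want a ballpark before picking a disk, the Prometheus storage docs give a rule of thumb: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample. A quick sketch of that arithmetic; the 1000 series per box and the 15s scrape interval are assumptions I picked for the example, not measurements:

```python
def samples_per_second(targets, series_per_target, scrape_interval_s):
    """Approximate ingestion rate: every series yields one sample per scrape."""
    return targets * series_per_target / scrape_interval_s


def estimate_tsdb_bytes(retention_days, ingest_rate, bytes_per_sample=2.0):
    """Rule of thumb: retention seconds * samples per second * bytes per sample.

    bytes_per_sample defaults to 2.0, the pessimistic end of the
    1-2 bytes per sample range quoted in the Prometheus docs.
    """
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * ingest_rate * bytes_per_sample


if __name__ == "__main__":
    # Assumed workload: 100 boxes, ~1000 series each, scraped every 15s.
    rate = samples_per_second(targets=100, series_per_target=1000, scrape_interval_s=15)
    for days in (15, 365):
        gib = estimate_tsdb_bytes(days, rate) / 2**30
        print(f"{days:>3} days retention: ~{gib:.0f} GiB")
```

Treat the result as an order-of-magnitude guess. Real usage depends on how well your samples compress and how much your series churn.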
@icedquinn
Yeah, it's easy. Just verify with promtool that it isn't absolutely broken and you are good to go.