Many aspects of the Analytics platform, when viewed from a developer's perspective, become clearer by understanding the data flow.
This article will describe in detail what happens from the time you invoke a monitor API call to when the data appears in a graph on the dashboard. After reading this article you should have a better understanding of these issues:
- When is my data saved?
- Is there any risk of losing data?
- When and how is data transmitted to the server?
- What happens if my device is offline?
- Where can I inspect the data? Files, network traffic, at the server?
This image illustrates the overall data flow from app via monitor to the server.
Data collected between Start and Stop is called a session. A session therefore typically represents one invocation, or run, of the application. The data comprises automatically gathered data (like total runtime and environment data) as well as user-provided data (tracked features, exceptions, etc.).
Inside the monitor the session data is handled like so:
- Recorded data is stored in-memory
- Every minute this session data will be saved to disk
- Upon important events the session will be sent to the server
- Aborted sessions are handled at startup, too
Session data is saved to disk every 60 seconds by default, so in the worst case 60 seconds of recorded statistics could be lost. The interval can be adjusted via the
Session data is saved to disk asynchronously via a background process and sent to the server when
- 20000 feature-timings or feature-values have been collected
- 24 hours have passed, every day
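The two thresholds listed above can be expressed as a small predicate. This is only a sketch of the rule as described, not the monitor's actual code: the names `SessionState`, `should_submit`, and both constants are made up for illustration, and it assumes that feature-timings and feature-values count towards one combined total.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the constant names are illustrative only.
MAX_FEATURE_ITEMS = 20_000
MAX_SECONDS_BETWEEN_SENDS = 24 * 60 * 60


@dataclass
class SessionState:
    feature_items: int              # feature-timings plus feature-values (assumed combined)
    seconds_since_last_send: float  # time elapsed since the last submission


def should_submit(state: SessionState) -> bool:
    """Return True when either threshold from the list above is reached."""
    return (
        state.feature_items >= MAX_FEATURE_ITEMS
        or state.seconds_since_last_send >= MAX_SECONDS_BETWEEN_SENDS
    )
```

For example, `should_submit(SessionState(150, 3600.0))` is false, because neither threshold has been reached.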
At Start the monitor submits the session data collected so far. Upon Stop it will again submit the current session data and then wait 2 seconds (by default) for it to be delivered before logging that it stopped. If logging is enabled, a typical monitor life-cycle will therefore look like this if only Start and Stop are called:
The session data file will only be deleted when the session data has been received and accepted by the Analytics servers and the monitor received a successful HTTP-response. Otherwise it will be kept for later discovery and re-submission by the monitor. Here is an illustration of the successful submission:
Note that there is really no deterministic way for the user of the monitor to be notified of successful or failed delivery of a specific piece of data. The monitor batches data and delivers the batches asynchronously to the servers and does not notify you about what data is being delivered.
It follows from the above behavior that you should always call Stop to ensure data is at least safely saved, and hopefully also successfully submitted.
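One way to make sure Stop is always reached is to wrap the application's work in a try/finally block. The sketch below uses a hypothetical `FakeMonitor` class that only records the calls made to it; it stands in for the real monitor and does not reproduce the real API's names or signatures.

```python
class FakeMonitor:
    """Hypothetical stand-in for the monitor; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("Start")

    def track_feature(self, name):
        self.calls.append(f"TrackFeature:{name}")

    def stop(self):
        self.calls.append("Stop")


def run_app(monitor, work):
    """Run work(monitor) between Start and Stop, calling Stop even if work raises."""
    monitor.start()
    try:
        work(monitor)
    finally:
        # Reached on normal exit and when work() raises an exception.
        monitor.stop()
```

Note that a finally block does not help when the process is killed outright, for example when a debugger stops it.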
There are a couple of tricky scenarios to be aware of. Say you have added a well-placed call to
Stop in your application's (platform dependent) shutdown code. Normally this will ensure that
Stop is called when your application exits, but not always:
- If you are running your application within a debugger (e.g. Visual Studio for .NET) and stop it then the application will come to an abrupt halt and not run any OnExit-code. It is hard to give good recommendations on this, since it is a pretty natural flow for developers to just stop the application they are running.
- If you track an exception using TrackException and then immediately exit the application (for instance, inside an unhandled exception handler) then the monitor's background thread may not have saved the exception to file yet, let alone have had time to submit it to the Analytics servers. In that particular scenario you should explicitly call Stop right after the TrackException call.
Data is stored on the file-system in a platform-specific manner. Please see the section on Local Data Storage for an elaborate description of where the data is saved and how to inspect it.
If there is not enough allowed space in the storage for this new session then it will not be saved to file but only kept in memory. The allowed space can be adjusted via the
MaxStorageSizeInKB setting. By default there is no upper limit but the individual platforms may impose limitations; for instance, Silverlight apps have a default limit of 1MB available for their Isolated Storage.
Data will only be lost if it cannot be saved to disk and also cannot be sent to the server.
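The storage rule can be illustrated with two small helpers. Both function names and their parameters are hypothetical; only the behavior (no limit by default, and data lost only when both saving and sending fail) comes from the text.

```python
from typing import Optional


def fits_in_storage(used_kb: int, session_kb: int,
                    max_storage_kb: Optional[int] = None) -> bool:
    """Return True if a session of session_kb may be written to storage."""
    if max_storage_kb is None:
        # No configured limit, which is the default described above.
        return True
    return used_kb + session_kb <= max_storage_kb


def is_lost(saved_to_disk: bool, sent_to_server: bool) -> bool:
    """Data is lost only when it was neither saved nor sent."""
    return not saved_to_disk and not sent_to_server
```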
An online application will almost always succeed in submitting the session data. However, invariably a situation will arise where data cannot be sent across the network to the Analytics server - or maybe cannot be sent before the application shuts down. Then what happens?
Data is always saved to storage (disk) first. If network submission fails then that data file will simply be kept and not deleted:
The next question obviously is: when is this data then submitted next?
The answer is: the next time the monitor's Start is called. At Start, the monitor will look for unsent, abandoned session data files and read and submit them (presently at most 5 of them at a time) along with the current session. Upon successful submission they are deleted:
Here are some prime examples of when this scenario could occur:
- The user is offline on vacation while using your app. All session data will be stored locally and sent the next time the app is started while online.
- Your application is crashing. You track the unhandled exception using TrackException. Data is saved to disk synchronously but right after TrackException the application exits, killing the monitor's background submission thread, which aborts the submission. Next time the application is started, the session data file containing the exception report will be found and submitted.
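The resubmission that happens at Start can be sketched roughly as follows. This is an illustration under assumptions, not the monitor's implementation: the `.session` file extension, the function name `resubmit_pending`, and the `send` callback (which returns True when the server accepted the data) are all invented here. The sketch covers only the leftover files, not the current session that is submitted along with them.

```python
from pathlib import Path
from typing import Callable, List

MAX_RESUBMIT_PER_START = 5  # "presently at most 5 of them at a time"


def resubmit_pending(storage_dir: Path, send: Callable[[bytes], bool]) -> List[str]:
    """Submit leftover session files and delete each one that was accepted.

    Returns the names of the files that were delivered and removed.
    """
    delivered = []
    # The ".session" extension is a made-up naming convention for this sketch.
    pending = sorted(storage_dir.glob("*.session"))[:MAX_RESUBMIT_PER_START]
    for path in pending:
        if send(path.read_bytes()):
            path.unlink()  # delete only after a successful submission
            delivered.append(path.name)
        # On failure the file stays in place for the next Start.
    return delivered
```

Files that are not picked up in one round, or whose submission fails, simply remain on disk and are considered again at the next Start.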
For more information on the nature of the monitor network traffic, please see the section on Network Traffic.
Usually the monitor will deliver data regularly and seamlessly when connected. You should never worry about data being delivered nor have to take any manual action for submission to occur.
Yet the monitor API has a ForceSync method. What's the deal? Is it an admission that the monitor cannot be trusted to submit data appropriately? And if not, when should you use it?
The short answers are: No, but don't use ForceSync unless you really, really have to.
The monitor usually delivers data just fine even for only occasionally connected applications. But in some special scenarios it can be useful to override the automatic behavior if you know better. Here are two prime examples:
- If it is imperative that a certain TrackFeature call at startup is always delivered, e.g. because you suspect a situation where the app may crash, then you should call ForceSync right after the feature-tracking.
- If your application controls the connectivity tightly and is only connected for a brief time (imagine an old-school smart terminal application that dials up just briefly to deliver data and then hangs up again) then it can make sense to set SynchronizeAutomatically to false and initiate the data delivery manually using ForceSync during that explicit period of connectivity, because only you know when your application is likely to be connected.
Both examples are rare, so the guidance still stands: use ForceSync only if you really have to.
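The briefly-connected scenario might look something like the sketch below. `SketchMonitor` is a hypothetical in-memory stand-in, not the real monitor API; its method names merely mirror the ForceSync and SynchronizeAutomatically names used in this article, and automatic delivery is deliberately not modeled.

```python
class SketchMonitor:
    """Hypothetical in-memory stand-in for the monitor; not the real API."""

    def __init__(self, synchronize_automatically: bool = True):
        # Stored for illustration only; automatic delivery is not modeled here.
        self.synchronize_automatically = synchronize_automatically
        self.pending = []    # recorded but not yet delivered
        self.delivered = []  # what the "server" has received

    def track_feature(self, name: str) -> None:
        self.pending.append(name)

    def force_sync(self) -> None:
        # Deliver everything recorded so far, right now.
        self.delivered.extend(self.pending)
        self.pending.clear()


def deliver_during_connection(monitor: SketchMonitor, connect, disconnect) -> None:
    """Open the connection, push pending data, and always hang up again."""
    connect()
    try:
        monitor.force_sync()
    finally:
        disconnect()
```

The point of the design is that the application, which alone knows when the connection is up, decides when delivery happens.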
Once data arrives at our servers it is processed in bulk and inserted into the Analytics data store. This is typically delayed for up to 5 minutes. Once inserted into the data store it can be viewed in Live Mode.
The server may reject the data for several reasons. Some reasons will cause the monitor to recognize that an error has occurred and that it should resubmit the data later. Other reasons will have the monitor believe that everything went well so it will delete the local data.
Here is how the monitor will react to all the various behaviors of the server:
- Success status 200, data accepted: This is the normal scenario. No problems.
- Success status 200, data rejected: The server can actively and silently discard the data, but still send back success status 200 to let the monitor believe that all went well, so that it will delete the local data copy and not resubmit the data again. The return message will contain the explanation for the rejection. Here are the scenarios:
  - Data is sent to an unknown product key, for instance from a no longer tracked product or as part of a DoS attack
  - The data plan for the product has been exceeded, so e.g. no more data is accepted this month
  - Data is more than 30 days old; the server only accepts and retroactively aggregates data that has been recorded within the last 30 days
  - The data is corrupted
- Error status 5xx: An internal server error will result in a 5xx status code. The monitor will retain the data for resubmission later. This should generally never happen.
- Failure to connect: As explained above this will simply cause the monitor to retain the session data locally and try to resubmit at a later point. If the Analytics servers are briefly taken down for upgrade or maintenance, this is also what will happen.
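The monitor's reaction to the outcomes listed above can be summarized as a small decision function. This is a sketch of the described behavior only; the `Action` enum and `action_for_response` are names invented for illustration. It treats only status 200 as success, which matches the statuses named in the list, and keeps the data in every other case.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    DELETE_LOCAL = "delete local copy"
    KEEP_FOR_RETRY = "keep for resubmission"


def action_for_response(status: Optional[int]) -> Action:
    """Map a server response to what happens to the local session file.

    status is the HTTP status code, or None when the connection failed.
    """
    if status is None:
        return Action.KEEP_FOR_RETRY  # failure to connect
    if status == 200:
        # Covers both "accepted" and "silently rejected": the monitor
        # cannot tell them apart and deletes the local copy either way.
        return Action.DELETE_LOCAL
    return Action.KEEP_FOR_RETRY      # e.g. 5xx internal server error
```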
Most of the data received by the server will be from applications running right now. But the server may also receive data that was recorded earlier and has, for example, been waiting for resubmission on an offline device.
Data older than 30 days is simply discarded. If your device cannot deliver data within 30 days then it will not make it into the Analytics server storage.
Data younger than 30 days is added to the days on which it was recorded. This means that the graphs you are viewing for the last 30 days are not static but can change. If you print out a report of the last 30 days and repeat that one week later, you will most likely see that the reported data has increased slightly due to the new (old) data.
Once accepted by the Analytics server the data will remain intact until it becomes older than what is supported by the account data plan. If the data plan supports a history window of 14 days this means that data older than 14 days will simply be deleted at the server.
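The two age rules (the 30-day acceptance window and the history window of the data plan) can be written as simple date comparisons. The function names are hypothetical, and the handling of the exact boundary day is an assumption: data that is exactly 30 days old is treated as still acceptable, since the text rejects data that is "more than 30 days old".

```python
from datetime import date

ACCEPT_WINDOW_DAYS = 30  # from the text: older data is discarded on arrival


def is_accepted(recorded: date, today: date) -> bool:
    """True if data recorded on `recorded` would still be accepted today."""
    return (today - recorded).days <= ACCEPT_WINDOW_DAYS


def is_retained(recorded: date, today: date, history_days: int) -> bool:
    """True if already-stored data is still within the plan's history window."""
    return (today - recorded).days <= history_days
```

With a 14-day history window, for example, `is_retained(date(2024, 1, 1), date(2024, 1, 20), 14)` is false, because the data is 19 days old.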
This article has shown how data is handled throughout the Analytics infrastructure - from in-memory structure in the monitor to locally stored files on the device, across the network, and finally inside the Analytics servers.