Question
I am trying to process a CSV file that contains information on more than 20,000 patients. There are 50 columns in total, and each patient has multiple rows because the data is hourly. Most of the columns map to the Observation resource type, e.g. heart rate, temperature, blood pressure.
I have successfully transformed the data into FHIR format. However, when I try to push the data into the FHIR server, the server throws an error saying that a maximum of 500 entries are allowed per bundle.
Even if I wait until I have 500 entries and then push the JSON, it takes quite a long time to cover 20000 * 50 values. Is there a more efficient way of bulk inserting the data into the Azure FHIR server?
Currently, I am using the following code, but it looks like it will take a lot of time and resources, as there are around 0.7 million rows in my CSV file.
def export_template(self, template):
    # Accumulate bundle entries until we approach the 500-entry limit,
    # then send a batch and keep the remainder for the next bundle.
    if self.export_max_500 is None:
        self.export_max_500 = template
    else:
        export_max_500_entry = self.export_max_500["entry"]
        template_entry = template["entry"]
        self.export_max_500["entry"] = export_max_500_entry + template_entry
        if len(self.export_max_500["entry"]) > 500:
            # Reuse the incoming template as the outgoing bundle with the
            # first 495 entries; keep the rest buffered for the next send.
            template["entry"] = self.export_max_500["entry"][:495]
            self.export_max_500["entry"] = self.export_max_500["entry"][495:]
            self.send_to_server(template)
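The send_to_server method is not shown in the post; a minimal sketch of how it might POST the batch Bundle to the service base URL (assuming the requests library and hypothetical self.fhir_url / self.token attributes for the endpoint and an OAuth2 bearer token) could look like this:

import requests

def send_to_server(self, bundle):
    # POST the batch Bundle to the FHIR service base URL.
    # self.fhir_url and self.token are assumed attributes, not from the post.
    headers = {
        "Content-Type": "application/fhir+json",
        "Authorization": f"Bearer {self.token}",
    }
    response = requests.post(self.fhir_url, json=bundle, headers=headers)
    response.raise_for_status()
    return response.json()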
Answer 1:
The most efficient way is not to send multiple (batch) bundles; it is actually to run many individual requests in parallel. Your problem is that you are sending these sequentially and taking a huge hit on the round-trip time. You can take a look at something like this loader, which parallelizes the requests: https://github.com/hansenms/FhirLoader. You will also want to increase the RUs on your service to make sure you have enough throughput to get the data in.
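For illustration, the parallel approach can be as simple as POSTing each resource individually and fanning the requests out over a thread pool so that round-trip latency overlaps instead of accumulating. A rough Python sketch, assuming the requests library; the FHIR_URL, TOKEN, and observations names are placeholders and not from the original post:

import requests
from concurrent.futures import ThreadPoolExecutor, as_completed

FHIR_URL = "https://<your-service>.azurehealthcareapis.com"  # placeholder endpoint
TOKEN = "<access-token>"  # placeholder OAuth2 bearer token

HEADERS = {
    "Content-Type": "application/fhir+json",
    "Authorization": f"Bearer {TOKEN}",
}

def post_resource(resource):
    # POST a single resource (e.g. an Observation) to its type endpoint.
    url = f"{FHIR_URL}/{resource['resourceType']}"
    response = requests.post(url, json=resource, headers=HEADERS)
    response.raise_for_status()
    return response.status_code

def load_in_parallel(resources, max_workers=32):
    # Run the individual POSTs concurrently on a thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(post_resource, r) for r in resources]
        for future in as_completed(futures):
            future.result()  # re-raise any HTTP error

# Usage: load_in_parallel(observations), where observations is a list of
# Observation resources produced by the CSV-to-FHIR transformation.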
Source: https://stackoverflow.com/questions/61483216/sending-bulk-data-to-azure-fhir-server