Firestore DeadlineExceeded exception for big collections

Multi tool use
Multi tool use








up vote
3
down vote

favorite
1












I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.



Naive Python solution



My first try was using the Python google-cloud-firestore (version 0.30.0) library.



source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()

counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)

app.logger.info('%s Finally read all %d documents', datetime.now(), counter)


Which gives the following output:



INFO:flask.app:2018-11-08 09:49:03.923795 6400 
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded


Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.



Paging with Python



For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after interface but there is still an offset interface with which I can at least read in batches.



for f in source_collection.offset(last_read_offset).get():


Which gives me correct results as long as the last_read_offset is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception from above. But when I start with something bigger I get:



Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000

skipped_results: 1


Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.



Retry via node.js



Okay maybe my code or the Python client library is bugged. Let's try with node.



const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);

var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0

console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);


Which does the same as the python library even if the timeout is much more clearly 60s.



[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'


Has anyone similar experiences or a good hint how to continue?



PS: The exportDocument / importDocument interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.



Edit: golang



And for the heck of it i tried the golang api.



log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()

if err != nil
log.Fatalln(err)


i := 0

for
_, err := snapshotIter.Next()

if err == iterator.Done
break

if err != nil
log.Fatalln(err)


if i % 100 == 0
log.Println(i)

i++


log.Println("Done")


Which runs into the same timeout as expected.



2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.


But in addition the offset is working fine:



snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)









share|improve this question









New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.



















  • You could log an issue with the Firebase support firebase.google.com/support/contact
    – Renaud Tarnec
    2 days ago










  • Good idea, thanks.
    – Carsten Rietz
    yesterday














up vote
3
down vote

favorite
1












I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.



Naive Python solution



My first try was using the Python google-cloud-firestore (version 0.30.0) library.



source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()

counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)

app.logger.info('%s Finally read all %d documents', datetime.now(), counter)


Which gives the following output:



INFO:flask.app:2018-11-08 09:49:03.923795 6400 
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded


Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.



Paging with Python



For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after interface but there is still an offset interface with which I can at least read in batches.



for f in source_collection.offset(last_read_offset).get():


Which gives me correct results as long as the last_read_offset is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception from above. But when I start with something bigger I get:



Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000

skipped_results: 1


Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.



Retry via node.js



Okay maybe my code or the Python client library is bugged. Let's try with node.



const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);

var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0

console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);


Which does the same as the python library even if the timeout is much more clearly 60s.



[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'


Has anyone similar experiences or a good hint how to continue?



PS: The exportDocument / importDocument interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.



Edit: golang



And for the heck of it i tried the golang api.



log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()

if err != nil
log.Fatalln(err)


i := 0

for
_, err := snapshotIter.Next()

if err == iterator.Done
break

if err != nil
log.Fatalln(err)


if i % 100 == 0
log.Println(i)

i++


log.Println("Done")


Which runs into the same timeout as expected.



2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.


But in addition the offset is working fine:



snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)









share|improve this question









New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.



















  • You could log an issue with the Firebase support firebase.google.com/support/contact
    – Renaud Tarnec
    2 days ago










  • Good idea, thanks.
    – Carsten Rietz
    yesterday












up vote
3
down vote

favorite
1









up vote
3
down vote

favorite
1






1





I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.



Naive Python solution



My first try was using the Python google-cloud-firestore (version 0.30.0) library.



source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()

counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)

app.logger.info('%s Finally read all %d documents', datetime.now(), counter)


Which gives the following output:



INFO:flask.app:2018-11-08 09:49:03.923795 6400 
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded


Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.



Paging with Python



For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after interface but there is still an offset interface with which I can at least read in batches.



for f in source_collection.offset(last_read_offset).get():


Which gives me correct results as long as the last_read_offset is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception from above. But when I start with something bigger I get:



Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000

skipped_results: 1


Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.



Retry via node.js



Okay maybe my code or the Python client library is bugged. Let's try with node.



const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);

var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0

console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);


Which does the same as the python library even if the timeout is much more clearly 60s.



[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'


Has anyone similar experiences or a good hint how to continue?



PS: The exportDocument / importDocument interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.



Edit: golang



And for the heck of it i tried the golang api.



log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()

if err != nil
log.Fatalln(err)


i := 0

for
_, err := snapshotIter.Next()

if err == iterator.Done
break

if err != nil
log.Fatalln(err)


if i % 100 == 0
log.Println(i)

i++


log.Println("Done")


Which runs into the same timeout as expected.



2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.


But in addition the offset is working fine:



snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)









share|improve this question









New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.



Naive Python solution



My first try was using the Python google-cloud-firestore (version 0.30.0) library.



source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()

counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)

app.logger.info('%s Finally read all %d documents', datetime.now(), counter)


Which gives the following output:



INFO:flask.app:2018-11-08 09:49:03.923795 6400 
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded


Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.



Paging with Python



For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after interface but there is still an offset interface with which I can at least read in batches.



for f in source_collection.offset(last_read_offset).get():


Which gives me correct results as long as the last_read_offset is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception from above. But when I start with something bigger I get:



Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000

skipped_results: 1


Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.



Retry via node.js



Okay maybe my code or the Python client library is bugged. Let's try with node.



const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);

var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0

console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);


Which does the same as the python library even if the timeout is much more clearly 60s.



[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'


Has anyone similar experiences or a good hint how to continue?



PS: The exportDocument / importDocument interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.



Edit: golang



And for the heck of it i tried the golang api.



log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()

if err != nil
log.Fatalln(err)


i := 0

for
_, err := snapshotIter.Next()

if err == iterator.Done
break

if err != nil
log.Fatalln(err)


if i % 100 == 0
log.Println(i)

i++


log.Println("Done")


Which runs into the same timeout as expected.



2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.


But in addition the offset is working fine:



snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)






python node.js firebase go google-cloud-firestore






share|improve this question









New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question









New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question








edited yesterday





















New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked Nov 9 at 13:26









Carsten Rietz

163




163




New contributor




Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






Carsten Rietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











  • You could log an issue with the Firebase support firebase.google.com/support/contact
    – Renaud Tarnec
    2 days ago










  • Good idea, thanks.
    – Carsten Rietz
    yesterday
















  • You could log an issue with the Firebase support firebase.google.com/support/contact
    – Renaud Tarnec
    2 days ago










  • Good idea, thanks.
    – Carsten Rietz
    yesterday















You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago




You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago












Good idea, thanks.
– Carsten Rietz
yesterday




Good idea, thanks.
– Carsten Rietz
yesterday

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);






Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.









 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53226594%2ffirestore-deadlineexceeded-exception-for-big-collections%23new-answer', 'question_page');

);

Post as a guest



































active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes








Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.









 

draft saved


draft discarded


















Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.












Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.











Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.













 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53226594%2ffirestore-deadlineexceeded-exception-for-big-collections%23new-answer', 'question_page');

);

Post as a guest














































































VnfLF2 4BfXPx72Ij,rMEZ I2t0oCt3Zg,etri Q R0s,7cigGyG G0U6Oue38Sorws4Vl,wwiYZvM6f,U yS
CJ3enlx4vH hzF,08,HNxn9hESLhbOZzarm Q

Popular posts from this blog

Use pre created SQLite database for Android project in kotlin

Darth Vader #20

Ondo