Firestore DeadlineExceeded exception for big collections
Multi tool use
up vote
3
down vote
favorite
I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.
Naive Python solution
My first try was using the Python google-cloud-firestore
(version 0.30.0) library.
source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()
counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)
app.logger.info('%s Finally read all %d documents', datetime.now(), counter)
Which gives the following output:
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded
Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.
Paging with Python
For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after
interface but there is still an offset interface with which I can at least read in batches.
for f in source_collection.offset(last_read_offset).get():
Which gives me correct results as long as the last_read_offset
is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception
from above. But when I start with something bigger I get:
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000
skipped_results: 1
Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.
Retry via node.js
Okay maybe my code or the Python client library is bugged. Let's try with node.
const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);
var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0
console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);
Which does the same as the python library even if the timeout is much more clearly 60s.
[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'
Has anyone similar experiences or a good hint how to continue?
PS: The exportDocument
/ importDocument
interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.
Edit: golang
And for the heck of it i tried the golang api.
log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()
if err != nil
log.Fatalln(err)
i := 0
for
_, err := snapshotIter.Next()
if err == iterator.Done
break
if err != nil
log.Fatalln(err)
if i % 100 == 0
log.Println(i)
i++
log.Println("Done")
Which runs into the same timeout as expected.
2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.
But in addition the offset is working fine:
snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)
python node.js firebase go google-cloud-firestore
New contributor
add a comment |
up vote
3
down vote
favorite
I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.
Naive Python solution
My first try was using the Python google-cloud-firestore
(version 0.30.0) library.
source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()
counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)
app.logger.info('%s Finally read all %d documents', datetime.now(), counter)
Which gives the following output:
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded
Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.
Paging with Python
For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after
interface but there is still an offset interface with which I can at least read in batches.
for f in source_collection.offset(last_read_offset).get():
Which gives me correct results as long as the last_read_offset
is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception
from above. But when I start with something bigger I get:
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000
skipped_results: 1
Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.
Retry via node.js
Okay maybe my code or the Python client library is bugged. Let's try with node.
const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);
var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0
console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);
Which does the same as the python library even if the timeout is much more clearly 60s.
[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'
Has anyone similar experiences or a good hint how to continue?
PS: The exportDocument
/ importDocument
interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.
Edit: golang
And for the heck of it i tried the golang api.
log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()
if err != nil
log.Fatalln(err)
i := 0
for
_, err := snapshotIter.Next()
if err == iterator.Done
break
if err != nil
log.Fatalln(err)
if i % 100 == 0
log.Println(i)
i++
log.Println("Done")
Which runs into the same timeout as expected.
2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.
But in addition the offset is working fine:
snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)
python node.js firebase go google-cloud-firestore
New contributor
You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago
Good idea, thanks.
– Carsten Rietz
yesterday
add a comment |
up vote
3
down vote
favorite
up vote
3
down vote
favorite
I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.
Naive Python solution
My first try was using the Python google-cloud-firestore
(version 0.30.0) library.
source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()
counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)
app.logger.info('%s Finally read all %d documents', datetime.now(), counter)
Which gives the following output:
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded
Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.
Paging with Python
For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after
interface but there is still an offset interface with which I can at least read in batches.
for f in source_collection.offset(last_read_offset).get():
Which gives me correct results as long as the last_read_offset
is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception
from above. But when I start with something bigger I get:
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000
skipped_results: 1
Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.
Retry via node.js
Okay maybe my code or the Python client library is bugged. Let's try with node.
const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);
var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0
console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);
Which does the same as the python library even if the timeout is much more clearly 60s.
[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'
Has anyone similar experiences or a good hint how to continue?
PS: The exportDocument
/ importDocument
interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.
Edit: golang
And for the heck of it i tried the golang api.
log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()
if err != nil
log.Fatalln(err)
i := 0
for
_, err := snapshotIter.Next()
if err == iterator.Done
break
if err != nil
log.Fatalln(err)
if i % 100 == 0
log.Println(i)
i++
log.Println("Done")
Which runs into the same timeout as expected.
2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.
But in addition the offset is working fine:
snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)
python node.js firebase go google-cloud-firestore
New contributor
I'm trying to read bigger collections from Google Firestore for testing and archiving purposes. I'm hitting some interesting errors when I try to get all documents from collections with more than 6k documents in them.
Naive Python solution
My first try was using the Python google-cloud-firestore
(version 0.30.0) library.
source_client = firestore.Client()
source = source_client.collection(collection)
source_data = source.get()
counter = 0
for f in source_data:
app.logger.info(f.id)
counter += 1
if counter % 100 == 0:
app.logger.info('%s %d', datetime.now(), counter)
app.logger.info('%s Finally read all %d documents', datetime.now(), counter)
Which gives the following output:
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
...
INFO:flask.app:2018-11-08 09:49:03.923795 6400
INFO:flask.app:2018-11-08 09:49:04.115410 6500
WARNING:flask.app:2018-11-08 09:49:04.128478 copy brocken by exception
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 142, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1000).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 588, in get
for index, response_pb in enumerate(response_iterator):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 83, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
# Permission is hereby granted, free of charge, to any person obtaining a copy
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded
Which seems to be caused by a quota. Even if I cannot see it here. It seems to be time based because when I run with small sleeps in between elements I get less throughput and get the exception after ~50s.
Paging with Python
For exactly this problem there is a paging part in this library. As my application should not care about what kind of data I try to transfer I cannot use the start_after
interface but there is still an offset interface with which I can at least read in batches.
for f in source_collection.offset(last_read_offset).get():
Which gives me correct results as long as the last_read_offset
is below 1001. If I start with an offset of 1000 I can get results until I get the google.api_core.exceptions.DeadlineExceeded exception
from above. But when I start with something bigger I get:
Traceback (most recent call last):
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 144, in transfer
count_collection(source_collection)
File "/home/carsten/projects/transfertool/firestore/transfertool/main.py", line 94, in count_collection
for f in source_collection.offset(1001).get():
File "/home/carsten/projects/transfertool/venv/lib/python3.6/site-packages/google/cloud/firestore_v1beta1/query.py", line 599, in get
raise ValueError(msg)
ValueError: Unexpected server response. All responses other than the first must contain a document. The response at index 1 was
read_time
seconds: 1541668338
nanos: 420813000
skipped_results: 1
Looking at the library code it seems that the backend is sending a message which is interpreted as invalid.
Retry via node.js
Okay maybe my code or the Python client library is bugged. Let's try with node.
const admin = require('firebase-admin');
admin.initializeApp(
credential: admin.credential.applicationDefault()
);
var db = admin.firestore();
admin.firestore().settings( timestampsInSnapshots: true )
var counter = 0
console.log('Read collection')
db.collection(collection).get()
.then(querySnapshot =>
querySnapshot.forEach(documentSnapshot =>
counter++;
);
console.log(counter)
)
.catch( error =>
console.log(error)
);
Which does the same as the python library even if the timeout is much more clearly 60s.
[2018-11-09T08:36:30.992Z] App listening on port 8080
[2018-11-09T08:36:30.993Z] Press Ctrl+C to quit.
[2018-11-09T08:36:37.390Z] Read collection
[2018-11-09T08:37:37.406Z] Error: 4 DEADLINE_EXCEEDED: Deadline Exceeded
at Object.exports.createStatusError (/home/carsten/projects/node_modules/grpc/src/common.js:87:15)
at ClientReadableStream._emitStatusIfDone (/home/carsten/projects/node_modules/grpc/src/client.js:235:26)
at ClientReadableStream._readsDone (/home/carsten/projects/node_modules/grpc/src/client.js:201:8)
at /home/carsten/projects/node_modules/grpc/src/client_interceptors.js:679:15
code: 4,
metadata: Metadata _internal_repr: ,
details: 'Deadline Exceeded'
Has anyone similar experiences or a good hint how to continue?
PS: The exportDocument
/ importDocument
interface is not enough as we sometimes have to adjust the data after reading. And I have no clue what kind of format Firestore is exporting to Google Cloud Storage or how to transform it.
Edit: golang
And for the heck of it i tried the golang api.
log.Println("Collecting data")
snapshotIter := client.Collection(collection.(string)).Documents(ctx)
defer snapshotIter.Stop()
if err != nil
log.Fatalln(err)
i := 0
for
_, err := snapshotIter.Next()
if err == iterator.Done
break
if err != nil
log.Fatalln(err)
if i % 100 == 0
log.Println(i)
i++
log.Println("Done")
Which runs into the same timeout as expected.
2018/11/12 15:01:20 Collecting data
2018/11/12 15:01:21 0
2018/11/12 15:01:21 100
2018/11/12 15:01:21 200
2018/11/12 15:01:21 300
2018/11/12 15:01:21 400
2018/11/12 15:01:22 500
2018/11/12 15:01:22 600
2018/11/12 15:01:22 700
....
2018/11/12 15:02:22 29800
2018/11/12 15:02:23 29900
2018/11/12 15:02:23 rpc error: code = DeadlineExceeded desc = The datastore operation timed out, or the data was temporarily unavailable.
But in addition the offset is working fine:
snapshotIter := client.Collection(collection.(string)).Offset(30000).Documents(ctx)
python node.js firebase go google-cloud-firestore
python node.js firebase go google-cloud-firestore
New contributor
New contributor
edited yesterday
New contributor
asked Nov 9 at 13:26
Carsten Rietz
163
163
New contributor
New contributor
You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago
Good idea, thanks.
– Carsten Rietz
yesterday
add a comment |
You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago
Good idea, thanks.
– Carsten Rietz
yesterday
You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago
You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago
Good idea, thanks.
– Carsten Rietz
yesterday
Good idea, thanks.
– Carsten Rietz
yesterday
add a comment |
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.
Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.
Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.
Carsten Rietz is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53226594%2ffirestore-deadlineexceeded-exception-for-big-collections%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
VnfLF2 4BfXPx72Ij,rMEZ I2t0oCt3Zg,etri Q R0s,7cigGyG G0U6Oue38Sorws4Vl,wwiYZvM6f,U yS
You could log an issue with the Firebase support firebase.google.com/support/contact
– Renaud Tarnec
2 days ago
Good idea, thanks.
– Carsten Rietz
yesterday