Tensorflow batch: keep result as strings
This simple program
import tensorflow as tf

input = 'string'
batch = tf.train.batch([tf.constant(input)], batch_size=1)

with tf.Session() as sess:
    tf.train.start_queue_runners()
    output, = sess.run(batch)
    print(1, input, output)
    print(2, str(output, 'utf-8'))
    print(3, input.split('i'))
    print(4, str(output, 'utf-8').split('i'))
    print(5, output.split('i'))
prints
1 string b'string'
2 string
3 ['str', 'ng']
4 ['str', 'ng']
ERROR:tensorflow:Exception in QueueRunner: Session has been closed.
print(5, output.split('i'))
TypeError: a bytes-like object is required, not 'str'
Why isn't the result a list of strings, if the input is?
OK, @jdehesa explained why, but not how to 'fix' it. I can apply bytes.decode() to the results of sess.run():
output, = map(bytes.decode, sess.run(batch))
And there exists tf.map_fn() that should do the same on tensors. The only question is how I can use this in my scenario?
PS: actually, the error message is puzzling, too. The problem is that we provide a bytes object, not a string. But the TypeError suggests exactly the opposite.
PPS: the error message explained, thanks to @jdehesa: it was about the split()'s parameter, not the object. output.split(b'i') works well!
python tensorflow
I am not sure I understand the question/problem. The result is a list of strings, just of length one in this case, due to the definition of batch. Make it batch_size=2 and you will have [b'string' b'string'] as output, etc.
– Uvar
Nov 13 '18 at 10:33
@Uvar, I have clarified my question
– Alex Cohn
Nov 13 '18 at 10:55
It looks like session.py has a class _ElementFetchMapper which has _contraction_fn - a function that is responsible, also, for converting the results to expected types. But _REGISTERED_EXPANSIONS is a constant in tensorflow 1.12.
– Alex Cohn
Nov 13 '18 at 16:32
asked Nov 13 '18 at 9:54, edited Nov 13 '18 at 15:16
Alex Cohn
1 Answer
The problem is that output is a bytes object, because TensorFlow tf.string tensors are indeed made of bytes. But then you are trying to use split with a str separator, and that is why it complains. Try:
output.split(b'i')
or:
output.decode().split('i')
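A minimal sketch of the two options above in plain Python, without TensorFlow. It assumes `output` holds the same bytes value that the question's program printed (`b'string'`), and shows that the TypeError comes from the `str` separator rather than from the bytes object itself:

```python
# Stand-in for the value fetched in the question: a bytes object.
output = b'string'

# bytes.split() needs a bytes separator; a str separator raises TypeError.
try:
    output.split('i')
except TypeError as err:
    print('TypeError:', err)

# Option 1: stay in bytes and pass a bytes separator.
print(output.split(b'i'))   # [b'str', b'ng']

# Option 2: decode once, then work with str.
text = output.decode('utf-8')
print(text.split('i'))      # ['str', 'ng']
```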
Cool, decode() is nicer than str(..., 'utf-8'). But still, the question was: is there a way to get decoded strings from run(batch)?
– Alex Cohn
Nov 13 '18 at 12:29
@AlexCohn Ah, well, about that, I think not. str (Python 3 str, that is) is just not a supported type in TensorFlow. If you look "under the hood", tf.string is in C++ tensorflow::string, which is currently defined as std::string, i.e. something like bytes. I'm not 100% sure that it is not possible but I don't see where could TensorFlow get the decoded strings from (at least at C++ level, there could be something at Python level implemented for convenience but I haven't seen it).
– jdehesa
Nov 13 '18 at 12:49
I know I can write map(bytes.decode, sess.run(batch)) but I would prefer for this to happen inside run().
– Alex Cohn
Nov 13 '18 at 15:00
@AlexCohn Actually I'm seeing only now there is this function tf.compat.as_text, which just calls decode if the object is bytes. Doesn't seem like they are planning to support it anytime soon... The problem is, some ops really should return bytes (e.g. reading a binary file), and I suppose wrapping some to return str but not others would be inconsistent and hard to maintain. Also it would fix the encoding to e.g. utf8 (unless you implemented options)... And at the same time you want consistent behavior across languages.
– jdehesa
Nov 13 '18 at 15:12
I guess there should be some way to apply tf.map_fn(), only I don't know how
– Alex Cohn
Nov 13 '18 at 15:17
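Since tf.string values come back as bytes, the decoding discussed in these comments has to happen on the Python side after run() returns. A rough sketch of wrapping that step is below. The helper name `decode_fetched` is hypothetical and not part of TensorFlow; it only generalizes the `map(bytes.decode, ...)` workaround to nested lists and tuples, and the `tolist()` branch is an assumption meant to cover the NumPy arrays that run() returns. A plain list stands in for the fetched batch so the snippet runs without TensorFlow.

```python
def decode_fetched(value, encoding='utf-8'):
    """Recursively turn bytes inside a fetched result into str.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Array-like objects (e.g. NumPy arrays) are converted to lists first.
    if hasattr(value, 'tolist'):
        return decode_fetched(value.tolist(), encoding)
    if isinstance(value, list):
        return [decode_fetched(v, encoding) for v in value]
    if isinstance(value, tuple):
        return tuple(decode_fetched(v, encoding) for v in value)
    return value  # numbers and other non-bytes values pass through unchanged


# Stand-in for sess.run(batch) with batch_size=2.
fetched = [b'string', b'string']
print(decode_fetched(fetched))   # ['string', 'string']
```

In the question's program this would presumably be used as `output, = decode_fetched(sess.run(batch))`. It fixes the encoding to UTF-8 by default, which is the trade-off raised in the comment above.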
answered Nov 13 '18 at 11:24
jdehesa