Count the amount of times a word is shown in a string
I have a big string where I need to:
- Convert words starting with an upper-case letter to lower case, so that all words are lower case.
- Count the number of times each word is shown. A word in this sense is a sequence of characters without whitespace or punctuation (!#= etc).
- Sort from the most frequent word to the least frequent word.
I've made a function that reads a .txt file and turns it into a string, but I'm not sure where to go from here. Any help with any of the bullet points would be greatly appreciated.
f#
I didn't downvote, but whoever did most likely did so because you're asking for someone to write code without showing what you've tried. As you appear to be new to F#, I've given an answer that explains one way to solve this problem thinking in a functional way. Hopefully this is a springboard to get you going with F# and solving problems. As F# runs on .NET, you can sometimes get inspiration from C# (like the punctuation code).
– DaveShaw
Nov 9 at 21:32
Ah, okay. I should've said that. I did already try: I was able to split the string so it printed words instead of chars, but I thought posting my ugly code would make my post more confusing. I was also trying to use Regex.Escape to at least count the occurrences of a word, and was looking at this: stackoverflow.com/questions/40385154/…
– jubibanna
Nov 9 at 21:59
It always helps to post what you have... That code is for when you know what you're looking for and want to count it. It could be adapted to solve your problem, but I'd always prefer a groupBy to get a count if you want to count all instances.
– DaveShaw
Nov 9 at 23:59
Question asked Nov 9 at 20:13 by jubibanna, edited Nov 9 at 20:25
2 Answers
Accepted answer (3 votes)
Let's go through this step by step, creating a function for each part.
Convert words starting with an upper-case to a lower-case word so that all words are lower case.
Split the string into a sequence of words (the lower-casing itself happens in formatWord further down):
    let getWords (s: string) =
        s.Split(' ')
This turns "hello world" into [|"hello"; "world"|].
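Splitting on a single space leaves tabs, line breaks and runs of spaces unhandled, which matters if the string was read from a .txt file. A minimal sketch of one way around that, assuming those are the only separators of interest; `getWordsAnyWhitespace` is a hypothetical name and not part of the answer:

```fsharp
open System

// Hypothetical variant of getWords: split on spaces, tabs and line breaks,
// and drop the empty entries that consecutive separators would produce.
let getWordsAnyWhitespace (s: string) =
    s.Split([| ' '; '\t'; '\r'; '\n' |], StringSplitOptions.RemoveEmptyEntries)

printfn "%A" (getWordsAnyWhitespace "hello   world\nagain")
// [|"hello"; "world"; "again"|]
```

It can be dropped into the chain in place of getWords, since it has the same input and output types.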
Count the number of times a word is shown. A word in this sense is a sequence of characters without whitespace or punctuation (!#= etc).
Part #1: Format a word as lower case without punctuation:
    open System

    let isNotPunctuation (c: char) =
        not (Char.IsPunctuation(c))

    let formatWord (s: string) =
        let chars =
            s.ToLowerInvariant()
            |> Seq.filter isNotPunctuation
            |> Seq.toArray
        new String(chars)
This turns "Hello!" into "hello".
Note that Char.IsPunctuation does not treat symbol characters such as '=' as punctuation; add a Char.IsSymbol check if those should be stripped too.
Part #2: Group the list of words by their formatted version:
    let groupWords (words: string seq) =
        words
        |> Seq.groupBy formatWord
This returns a sequence of tuples: the first part of each tuple is the key (the result of formatWord) and the second part is the sequence of original words that share that key.
It turns ["hello"; "world"; "hello"]
into
    [("hello", ["hello"; "hello"]);
     ("world", ["world"])]
Sort from the most frequent word to the least frequent word.
    let sortWords group =
        group
        |> Seq.sortByDescending (fun g -> Seq.length (snd g))
This sorts the sequence in descending order (biggest first) by the length (count) of the items in the second part of each tuple; see the representation above.
Now we just need to clean up the output:
    let output group =
        group
        |> Seq.map fst
This picks the first part of each tuple in the group. It turns ("hello", ["hello"; "hello"]) into "hello".
Now that we have all the functions, we can chain them together:
    let s = "some long string with some repeated words again and some other words"
    let finished =
        s
        |> getWords
        |> groupWords
        |> sortWords
        |> output
    printfn "%A" finished
    // seq ["some"; "words"; "long"; "string"; ...]
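The chain above returns only the words and discards the counts. If the counts are wanted as well, one possible sketch is to normalise each word first and then use Seq.countBy, which pairs every distinct word with its number of occurrences. The names `normalise` and `wordCounts` are hypothetical, and splitting on a single space is kept as an assumption from the steps above:

```fsharp
open System

// Lower-case a word and strip punctuation characters.
let normalise (s: string) =
    let chars =
        s.ToLowerInvariant()
        |> Seq.filter (fun c -> not (Char.IsPunctuation(c)))
        |> Seq.toArray
    String(chars)

// Hypothetical helper: (word, count) pairs, most frequent first.
let wordCounts (s: string) =
    s.Split(' ')
    |> Seq.map normalise
    |> Seq.filter (fun w -> w <> "")   // skip entries that were only punctuation
    |> Seq.countBy id
    |> Seq.sortByDescending snd
    |> Seq.toList

printfn "%A" (wordCounts "Some long string with some repeated words. Words again, some more!")
```

For this input the first two entries are ("some", 3) and ("words", 2); the remaining words each occur once.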
Wow! Thank you so much Dave, I actually learned quite a lot here! There's a lot I didn't know here so I will read your answer a couple of times again.
– jubibanna
Nov 9 at 21:38
Second answer (1 vote)
Here's another way, using Regex:
    open System.Text.RegularExpressions

    let str = "Some (very) long string with some repeated words again, and some other words, and some punctuation too."

    str
    |> (Regex @"\W+").Split
    |> Seq.choose (fun s -> if s = "" then None else Some (s.ToLower()))
    |> Seq.countBy id
    |> Seq.sortByDescending snd
The pattern \W+ splits on runs of non-word characters, and the result is a sequence of (word, count) pairs with the most frequent word first.
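A possible variation on the Regex approach is to match the words directly instead of splitting on what lies between them, so no empty strings have to be filtered out. This is only a sketch: `countWords` is a hypothetical name, and it assumes that `\w+` (letters, digits and underscore) is an acceptable definition of a word.

```fsharp
open System.Text.RegularExpressions

// Hypothetical helper: match runs of word characters, lower-case them,
// count each distinct word and sort by count, most frequent first.
let countWords (text: string) =
    Regex.Matches(text, @"\w+")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Value.ToLowerInvariant())
    |> Seq.countBy id
    |> Seq.sortByDescending snd
    |> Seq.toList

printfn "%A" (countWords "Some (very) long string, with some words.")
```

Here "some" is counted twice and comes first; the other five words each appear once.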
Accepted answer: answered Nov 9 at 20:51 by DaveShaw
Second answer (Regex): answered Nov 10 at 7:56 by gileCAD