Comments on I Lessen Data: "Python sentence segmentation, kind of quick and mostly legit" (by Dan)

Anonymous (2015-11-07 16:05):
This doesn't seem to work; the Unpickler throws an ImportError (ImportError: No module named nltk.tokenize.punkt).

Is there any way around this without pulling in the entire NLTK overhead?

Dan (2015-11-07 16:10):
Guess you do have to install NLTK. It looks like you can just do it with pip now, though. At a terminal:

pip install nltk

Then use the code above.

Anonymous (2015-11-07 16:15):
Thanks for the quick reply, Dan. I was hoping to do this task without importing any external libraries.

Any good tips for doing so (apart from a bunch of regexes, which drive me nuts)?

Dan (2015-11-07 20:20):
Nah, sorry. Well, you could just split on periods and exclamation points if you don't have to be very accurate. It will be pretty bad, but it's the best dead-simple, no-library solution I can think of.
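A minimal sketch of the NLTK route discussed in the thread (after `pip install nltk`). The example text is made up; instantiating `PunktSentenceTokenizer` directly with its unsupervised defaults sidesteps the pretrained-model download that the usual `sent_tokenize` entry point requires.

```python
# Sketch only: uses the nltk.tokenize.punkt module mentioned in the thread.
# PunktSentenceTokenizer() with no training text falls back to built-in
# defaults, so no nltk.download() call (and no pickled model) is needed.
from nltk.tokenize.punkt import PunktSentenceTokenizer

tokenizer = PunktSentenceTokenizer()
text = "This is one sentence. Here is another! And a third?"
for sentence in tokenizer.tokenize(text):
    print(sentence)
```

The untrained tokenizer knows no abbreviations, so accuracy on real text will be lower than with the downloaded `punkt` model, but it demonstrates the API without the full data download.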
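Dan's dead-simple, no-library fallback (split on sentence-ending punctuation) could be sketched like this; the function name and the inclusion of "?" alongside "." and "!" are my additions, and as he warns, abbreviations like "Dr." will break it.

```python
# Naive, dependency-free sentence splitter: cut after ., !, or ?
# (hypothetical helper illustrating the fallback suggested in the thread).
def naive_sentences(text):
    sentences, current = [], []
    for ch in text:
        current.append(ch)
        if ch in ".!?":  # treat any of these as a sentence boundary
            sentences.append("".join(current).strip())
            current = []
    tail = "".join(current).strip()
    if tail:  # keep trailing text that lacks final punctuation
        sentences.append(tail)
    return sentences

print(naive_sentences("Hello there. How are you? Fine!"))
# → ['Hello there.', 'How are you?', 'Fine!']
```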