This is a companion discussion topic for the original entry at https://thoughtbot.com/upcase/videos/extract-class
You said there is an exercise for us, is that true?
How often do you guys parse CSV in your projects?
I know it was a simple example in this video but Iām very interested in approaches you take to avoid CSV parsing errors like encoding, col/row separator, quote_char, etc.
My current project has a feature where our Rails app parses uploaded CSV file, the file is most of the times a conversion from an excel/numbers spreadsheet. Iām on my 3rd rewrite/refactor/revision of my parser class attempt but it feels like I still miss some small gotchas with encoding and quotation.
Any tips?
Great video, as always! I had a question on the forums a couple of weeks ago with a strikingly similar example, so this was perfect timing to help me work through that question and you guys did an excellent job of easing my concerns and explaining your approach.
I have historically been afraid to remove or alter the higher level tests after this type of refactoring, but seeing you and Joe keep the tests around until they are back to green, and then changing the test to use doubles to isolate it makes perfect sense. Joe also alluded to the potential changing the constructor of the original class to take an instance of the new class as a further refactoring, so hopefully we can look forward to more videos with more details around that type of inversion of control.
Another great video guys, thanks a lot. Really glad you mentioned the issue of dependencies towards the end, and how you could simply pass a Parser
into Sender
instead of the CSV data. I think that makes total sense, but for me, the Sender
is not really expecting a parser object, but more an object that plays the role of a RecipientList
(or similar) which implements the the :recipients
method.
That way, it just expects an object to give it an array of hashes. Then we could have CSVParser
or JSONParser
or whatever, that plays the RecipientList
role. What do you think? (And just to say, this line of thinking is totally from just watching you guys in action over the last few weeks with this series!)
Probably Iāve been asking this for a long time now. But doesnāt adding spec_helper
implies that you are not isolating that particular test from Rails? In this particular example do you actually need to load Rails to get this test passing? Itās seems that you really donāt need it. Can you elaborate on that? Great video though; I probably will keep adding questions because I spend the whole week just trying to practice what you say in there.
Thanks again,
While It may be loading probably more from the stack. The slow thing is when you start hitting the database. So loading spec_helper
doesnāt imply or avoid.
Also, with the new changes in rails, like adding Spring, most of the pain from loading the whole stack may be ignored, so keeping the spec_helper, is more like avoiding to manually load things like each of the files tested or libraries like factory girl.
@Cicloid Even in that case, I think that loading factoryGirl is something that for this particular refactoring is not needed. Also I prefer to just add the exact dependencies that I need if something on the performance run go wrong I will be able to see it then without traveling in the entire list of things that Rails loaded from the spec_helper
but then again thatās my approach at the moment.
@jferris Any thoughts on this CSV parsing question?
HI Alex, have you thought about using the roo
or importeroo
gems? Especially if youāre working from Excel, it would probably be helpful to let that gem do some of the lifting for you.
@alexbush we run into CSV parsing and generating pretty frequently. Iāve used the built-in CSV support from Ruby for the past few years, and itās held up pretty well. There are a few things I find helpful when dealing with it:
- Use a dedicated parser class that encapsulates CSV parsing. This also serves as a good place to collect errors.
CSV.parse
handles parse errors, but you can make them prettier. - Avoid calling
to_csv
or anything like that on collections. It inevitably results in trying to generate CSV for a huge collection, resulting in memory bloat. Do it row-by-row instead. - Try to do both parsing and generating in background jobs, as large CSV files will slow down your HTML requests.
If there are any specific examples or questions, feel free to start a new thread on the subject.
@danscotton I agree - if you inject the dependency, you can totally hide the fact that parsing is going on altogether. I would probably make the Parser
implement the Enumerable
interface and then just accept a recipients
parameter in initialize
. Then, you can pass an Array
, a Parser
, or the results from a database query, and Sender
doesnāt have to change at all, even semantically.
@Rafael_George This particular example actually doesnāt actually use Rails! The spec_helper.rb
I was using in this example just loaded rspec
, active_support
, and files from lib/*.rb
.
In general, I havenāt found a lot of benefit from having separate spec_helper.rb
files, even in Rails projects. There are two potential benefits to a slimmer spec_helper.rb
:
- Speed: loading Rails and its dependencies takes a while. However, using something like spring largely removes this concern.
- Isolation: loading Rails makes it harder to examine your dependencies. Unfortunately, just having ActiveSupport (necessary for most framework classes like models, views, controllers, or mailers) is enough to bring auto loading into the picture. Because Ruby doesnāt isolate files from each other - a file
require
d from one location is available everywhere - and because ActiveSupport automatically tries to load dependencies anyway, I find that you really canāt have dependency isolation in a Rails project. Attempting to do so just provides a false sense of security, as thereās no way to enforce the isolation - you never know where your dependencies are coming from.
@jferris Normally when I see spec_helper
I tend to think thatās loading Rails stuffs. In another note what Iām doing is just to be explicit about my dependencies requiring them one by one; also I tend to use rspec instead of rake spec
and use the .rspec
file to set which directories I want rspec to check. Iām talking from the point of view when Iām testing just unit after that I normally tend to autoload everything for my integration tests which are fewer. But I see your point regarding you donāt know where the magic is coming from. Thanks for your reply.
Another great episode, I have two questions:
- Do you prefer to have attr_reader or access instance variables directly like
@parser
and@message
inSender
class? - After it refactored, do you usually remove private method
recipients
and access it directly@parser.recipients
insend
method?
I donāt think it matters much. There are pros and cons to each approach, but the differences are fairly small.
Nope. That method is there (and private) because it helps hide the details of how we access the recipients. Moving it into the send
method would be adding unnecessary detail into send
.
Yep! Itās available on the dashboard, or here: https://github.com/thoughtbot/extract-class-exercise
@benorenstein Can we talk publicly about this ex. Iām talking about blogs and stuffs? Because itās private are you already know.
Absolutely.
@jferris This article came in good timing spec āHelperā what do you think?
When we extract private methods to a new class, donāt we think itās just a implementation detail? So their behavior should be tested by their higher level?
I wrote a blog post regarding this ex. Would guys mind on give me some feedback I will really appreciate it.
Another refactoring story
Thanks