First, a clarification: in this post, by "document" I don't mean documents like a Software Requirements Specification (SRS) or a Software Design Document (SDD). I mean comments written to document your code, such as Javadoc (Java) or docstrings (Python).
def sort(numbers):
    """Given a list of numbers in any order, returns a naturally
    ordered list, that is, a[i] <= a[j] for every i, j with i < j"""
    # not implemented
Okay, this may be too detailed a comment for a widely known sorting function, but it is just an example, so focus on what it illustrates rather than on the example itself. The comment states what the function expects and what running it will return.
Writing docstrings first has these benefits:
1. You have clearly defined what your function is supposed to do.
2. As a result of 1, you can write tests more easily.
3. You have already written the documentation.
Next, you can write the actual code and then the tests, since you know clearly what you are going to code. Writing the docstring first has another benefit: if the docstring is becoming too long or hard to write, or it seems to tell two stories, then your function is clearly not very cohesive and needs to be split.
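As a rough sketch of how the docstring can drive the tests, the contract for sort translates almost directly into assertions. The helper is_naturally_ordered, the test function, and the use of the built-in sorted() as the eventual body are my assumptions for illustration, not part of the original example.

```python
def sort(numbers):
    """Given a list of numbers in any order, returns a naturally
    ordered list, that is, a[i] <= a[j] for every i, j with i < j"""
    # Assumption: once the contract is written, delegate to the built-in sorted().
    return sorted(numbers)


def is_naturally_ordered(a):
    """Hypothetical helper: check the ordering promised by sort's docstring."""
    # Comparing adjacent pairs is enough because <= is transitive.
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))


def test_sort():
    """Test written from the docstring alone, without reading the body."""
    numbers = [3, 1, 2]
    result = sort(numbers)
    assert is_naturally_ordered(result)
    # The result must contain the same elements as the input.
    assert sorted(result) == sorted(numbers)
    # "Any order" also covers the empty list.
    assert sort([]) == []


test_sort()
```

Note that the test only relies on what the docstring promises, so it could have been written while the body was still "not implemented".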
These are not new ideas. I think similar things are already explained in [1]: design by contract; if your function's name says too much, it is not cohesive; if something is hard to explain, it may not be a good idea to implement it. In fact, it seems that some projects already use this approach (I saw them after writing the post).
When you research these processes and methodologies, you will find many proponents and opponents of each, with legitimate reasons. I think the best way to learn which one works for you (TDD, DDD, ?DD, where ? stands for anything) is to try each for some time and measure the results.
[1] The Pragmatic Programmer, Clean Code, The Zen of Python, and many other books and articles
Recommender systems (RS) are one of the hot topics in both academia and industry today. The aim of an RS is to recommend new items to users based on their historical data. This is a very incomplete definition of RS. For a more complete one, see: Recommender System.
After applying SVD and reductions, the most similar user to U1 is U2. A 0 (zero) means the user hasn't rated that movie yet. I call it the "ratings matrix at t2" since it is the current ratings matrix and t2 > t1. Based on this information, we would predict that U1 may give a 4 to movie M5. Now let's look at the same matrix at time t1.
Please note that this matrix has more zeros, since it corresponds to an earlier time when some users hadn't yet rated the movies they had rated by time t2. After the same operations, we find that the most similar user to U1 is U3. So in this case we would predict that U1 may give a 1 to movie M5.
So the past and the present give very different results. In the past U3 was the most similar user to U1, and now U2 is. But if U1 and U3 were similar in the past, they may become similar again in the future. Wouldn't it be better if we knew this now? Okay, we cannot know the future (if we could, we wouldn't need to bother with these methods for recommendation), but we can rely on the information in the past. So my suggestion is that an RS can use several ratings matrices, taken as snapshots at various times, for recommendation. For our example, we can use the ratings matrices at both t1 and t2 and predict the rating of U1 for M5 by averaging both results: (4 + 1) / 2 = 2.5. Or, if we think the past should contribute less to the prediction, we can reduce its effect: 4 * 0.6 + 1 * 0.4 = 2.8.
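The two ways of combining the snapshot predictions can be sketched as one small function. The name blend_predictions and its parameters are hypothetical, and the weights 0.6 and 0.4 are just the ones from the example, not tuned values.

```python
def blend_predictions(predictions, weights=None):
    """Combine one predicted rating per snapshot into a single prediction.

    predictions: predicted ratings, one per snapshot (newest first here).
    weights: optional per-snapshot weights; equal weights if omitted.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")
    # Dividing by the total weight means the weights need not sum to 1.
    total = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total


# Prediction from the matrix at t2 is 4, from the matrix at t1 is 1.
plain = blend_predictions([4, 1])                  # (4 + 1) / 2
weighted = blend_predictions([4, 1], [0.6, 0.4])   # 4 * 0.6 + 1 * 0.4
print(plain, round(weighted, 2))
```

With more than two snapshots the same function applies; how the weights should decay with age is an open choice that would have to be measured.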
For now I couldn't test my method, since the data I use (GroupLens data: http://www.grouplens.org/node/12) does not permit it. Although it has timestamps, I think that because users are asked to rate 15 movies immediately after registration, the timestamps are too close to each other, so taking distinct snapshots is difficult.
On the other hand, this method has two bottlenecks that come to mind immediately:
1. How should snapshots be taken? Based on what criteria, and how far apart in time?
2. Having more matrices will degrade performance and raise scalability issues. Storing these matrices also takes extra space.
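One possible answer to the first question, sketched under my own assumptions, is to keep a single timestamped rating log and rebuild a snapshot for any cutoff time on demand. The function name snapshot, the tuple layout, and the toy data below are all hypothetical. A fixed cutoff is only one criterion among many.

```python
def snapshot(ratings, users, movies, cutoff):
    """Build a ratings matrix from the ratings made at or before `cutoff`.

    ratings: iterable of (user, movie, rating, timestamp) tuples.
    Returns a dict mapping each user to a list of ratings in `movies`
    order, where 0 means "not rated yet" (the convention used in the text).
    """
    matrix = {u: [0] * len(movies) for u in users}
    column = {m: i for i, m in enumerate(movies)}
    # Process in time order so a later re-rating overwrites an earlier one.
    for user, movie, rating, timestamp in sorted(ratings, key=lambda r: r[3]):
        if timestamp <= cutoff:
            matrix[user][column[movie]] = rating
    return matrix


# Hypothetical toy log: (user, movie, rating, timestamp).
ratings = [
    ("U1", "M1", 5, 10),
    ("U2", "M1", 4, 20),
    ("U3", "M5", 1, 30),
    ("U2", "M5", 4, 60),
]
users = ["U1", "U2", "U3"]
movies = ["M1", "M5"]

at_t1 = snapshot(ratings, users, movies, cutoff=40)    # earlier snapshot
at_t2 = snapshot(ratings, users, movies, cutoff=100)   # current snapshot
print(at_t1)
print(at_t2)
```

Rebuilding snapshots from the log trades storage for computation, so it eases the space concern in point 2 but not the performance one.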
In this post, I will try to implement the Observer Pattern [GoF] using Python decorators. For those who do not know what the Observer Pattern is, here is an incomplete summary. The Observer pattern is a design pattern that handles the change and notification mechanism between a changing object and the other objects that are interested in its changes. The most widely used example for this pattern is this: a spreadsheet, a pie chart, a bar chart, etc. are derived from the same data, and we want to ensure that all of them are up to date when we change the underlying data.
# the data everyone observes
data = []

# the functions that are interested in data and change of data.
observers = []


def average():
    "Prints the average of the numbers stored in data"
    print("The average: %f" % (sum(data) / len(data),))


def minimum():
    "Prints the minimum stored in data"
    print("The minimum %d" % min(data))


def maximum():
    "Prints the maximum stored in data"
    print("The maximum %d" % max(data))


def observer_decorator(subscribers):
    "This is where observers are stored and notified after changes"
    def wrap(f):
        "Decorate the function to notify everyone after each change"
        def wrapped(*args, **kwargs):
            f(*args, **kwargs)
            for func in subscribers:  # tell everyone about change
                func()
        return wrapped
    return wrap


@observer_decorator(observers)
def insert(element):
    "Insert an element to data"
    data.append(element)
We simply use the decorator's argument to store the interested parties (namely, the subscribers). The difference between this and the regular class/object-based Observer pattern is that we don't need to create classes and don't need methods with the same name. Moreover, we don't need to change the definition of the function that changes the data. Here is the code in action:
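The original test script is not reproduced here, so what follows is only a reconstruction. The starting values 9, 3, 5 and the inserted values 2, 11, 1 are my assumptions, chosen so that the run is consistent with the output reported below. The definitions are repeated so the sketch runs on its own.

```python
# Assumed starting values, placed in data directly (no notification happens).
data = [9, 3, 5]
observers = []


def average():
    "Prints the average of the numbers stored in data"
    print("The average: %f" % (sum(data) / len(data),))


def minimum():
    "Prints the minimum stored in data"
    print("The minimum %d" % min(data))


def maximum():
    "Prints the maximum stored in data"
    print("The maximum %d" % max(data))


def observer_decorator(subscribers):
    "This is where observers are stored and notified after changes"
    def wrap(f):
        def wrapped(*args, **kwargs):
            f(*args, **kwargs)
            for func in subscribers:  # tell everyone about change
                func()
        return wrapped
    return wrap


@observer_decorator(observers)
def insert(element):
    "Insert an element to data"
    data.append(element)


# The decorator holds a reference to the same list object, so the list
# must be mutated in place (extend/append/remove), never reassigned.
observers.extend([minimum, maximum])
insert(2)                  # notifies minimum and maximum
print("-" * 50)
observers.append(average)  # a new interested party arrives
insert(11)                 # notifies minimum, maximum and average
print("-" * 50)
observers.remove(minimum)  # one party leaves
insert(1)                  # notifies only maximum and average
```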
Interested parties come and go whenever they want, and the ones who stay get the notifications so that they can take the necessary actions. OK, this may not be a real notification process, since we are actually calling the functions that handle the operations rather than telling them that the data has changed.
The output of the test is:
The minimum 2
The maximum 9
--------------------------------------------------
The minimum 2
The maximum 11
The average: 6.000000
--------------------------------------------------
The maximum 11
The average: 5.166667
It runs as we expected. We implemented Observer Pattern with Python Decorators.
Note: I make no claim about the correctness of this method. If you find anything wrong here (definitions, code, etc.), please let me know.