Random Thoughts @dimitryp - Tumblr Blog

Importance of “putting a stake in the ground” in modern Western society

Many Westerners believe that whoever is "first" has the strongest claim. This tradition of thought has a long history in military affairs and politics.

Why is it so prevalent? Maybe:

because it's a simple principle to resolve competing claims

it serves as a building block for some important mechanisms, which are necessary for the continuing survival of the society - and this heuristic survives along with the mechanisms it strengthens

For the latter, it seems plausible that rewarding the first claimer incentivizes risk-taking, and therefore produces higher rewards.

How can you tell whether a person believes in this concept? I see a few examples:

they may suggest that early partners in a venture deserve a higher ownership share just for being "early";

they may suggest that early supporters of a policy or politician deserve to implement the policy and be appointed to more senior positions.

Intuitively, this heuristic seems arbitrary to me, and likely to cause distortions in the long term.


“Better to own it”

Yesterday my good friend was hesitant to share his piano cover with the public, worrying it's not perfect.

I convinced him to do it. He knows what "perfect" is because he looks up to it and it drives him to become better. My argument is: most people don't know what "perfect" is.

Does he want to be a person who is not known as a piano player at all? Or does he want to be a person who can play piano imperfectly, yet better than most?

He agreed with my argument. "Better to own it", he said.

Why YC “office hours” rock

I am going through Y Combinator's MOOC "Startup School 2018". One of the components of the course is "office hours": a one-to-two-hour activity every week in which founders from the same group get together for a group video chat. It's led by an experienced founder who keeps it structured. Everyone speaks in turn: describes their progress and their metrics, asks questions and says "what's on their mind."

I am behind on video lectures and barely participate in the forums, and I believe that "office hours" is the most important part of the course. I am not the first one to claim so, and YC itself openly states that "office hours" is the critical part of the SUS (Startup School) and also of their in-person SF-based program.

Last night I was thinking about it and here is my guess as to why that is.

Startups are not obvious. On many levels, a startup looks like a small business, yet the goal of a startup founder is to build a big company - or fail completely and move on to the next thing. There is no middle ground. To grow big, a startup must solve a yet unsolved problem: make something previously impossible - possible; or make something previously hard - easy (❤️ Perl).

Startups are not obvious! The steps you have to go through are not obvious! Most people I know are not startup founders. Most people I know do obvious things! I applaud my friends who work hard, work smart, study, support families, raise kids, party together, pass their experience and knowledge to others. Doing obvious things well is hard, and doing obvious things well is rewarding. The problem is: I can't compare my pursuit to theirs - even if they match my intensity. I can't measure myself against them, because the roads we travel are so different.

In short, I felt lost, wondering whether I was on track or whether I was even moving! This is unnatural to a human. Evolutionarily, one is almost guaranteed to survive and progress if he does what others around him do - and the reason why you don't see people doing random-thing-X is likely because random-thing-X kills.

Being a product of human evolution, I experienced an almost physical resistance to doing something that nobody around me does. "Office hours" gave me a group of real living breathing peers who are building startups and who aren't killed doing that. Building a startup is normal, I can physically feel it now.

Relative success is only a mild indicator of a superior epistemic or instrumental rationality. The habit of rushing to self-update after observing achievements higher than mine must be stopped. Friends, please keep reminding me about this!


New GPG/PGP subkeys!

Of Microservices and Unit-Tests

Microservices is my favorite methodology (not technology! but methodology) these days: it allows a team to move fast even if the team is young, heterogeneous and opinionated. How? Easy: each microservice is a black box, and it doesn't matter what's inside - it still can be tested! (The black-box part is important, because it allows the org to hire younger developers and give them the freedom to choose the language and framework and programming model, if any.)

TDD is a well-known technique enabling fast creation of high-quality software. With black-box microservices, tests can be created after the fact, and tests survive complete rewrites of the black-box (which are inevitable, in high-turnover orgs).

Practicing this philosophy for the last few years gave good results and put me into the habit of dismissing unit-tests. Indeed, I can't force a free man into writing a unit-test, and I don't want to waste my time writing unit-tests for someone else's code: it's usually impossible to do if the code wasn't designed with testing in mind (language-of-the-day is not a problem: I like trying new things too).

Wise old men kept telling me that I am wrong to dismiss unit-tests. I dismissed them instead. Now it clicked! There is value in unit-tests, even if for different reasons.

Combinatorial explosion of possible scenarios makes writing black-box tests a hard task. Imagine a use-case of, say, 10 steps, where every other step has two (mutually exclusive) failure scenarios. Each of those five steps then has three possible outcomes, which gives 3^5 = 243 possible paths. That's a lot! Yet that's how many black-box tests I need to cover code which took me 4 hours to write. With unit-testing I can cover all this ground with only 29 tests (10+5*2+9).
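The arithmetic above can be double-checked with a short sketch. It rests on two assumptions of mine that the post does not spell out: that a failure at one step does not cut the path short (so the five fallible steps combine freely), and that the "9" in the unit-test count stands for the hand-offs between adjacent steps.

```python
from itertools import product

STEPS = 10
# Assumption: "every other step" means 5 of the 10 steps can fail, and each
# of those has 3 outcomes: success or one of two exclusive failures.
FALLIBLE_STEPS = STEPS // 2
OUTCOMES = ("ok", "failure_a", "failure_b")

# Black-box view: every combination of outcomes is its own end-to-end path.
black_box_paths = sum(1 for _ in product(OUTCOMES, repeat=FALLIBLE_STEPS))

# Unit-test view, following the post's 10 + 5*2 + 9 breakdown.
happy_path_tests = STEPS                # one success test per step
failure_tests = FALLIBLE_STEPS * 2      # two failure tests per fallible step
handoff_tests = STEPS - 1               # assumption: one per adjacent pair
unit_tests = happy_path_tests + failure_tests + handoff_tests

print(black_box_paths, unit_tests)  # 243 29
```

The black-box count grows exponentially with the number of fallible steps, while the unit-test count grows linearly.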

I still don't know how to reconcile unit-testing with poorly written throw-away code (freedom!), but it's a problem worth solving.

TV and newspapers - and their modern equivalents

TV and newspapers are both examples of fast-food for the mind.

There is a big difference: TV is a totally passive endless stream of information, while a newspaper can be read in full, with an inevitable “time without newspaper” afterwards. A newspaper also allows choosing the order and the subset of articles to read.

In a web-based world, I hypothesize that Facebook is analogous to a TV (with its news feed, which is ordered and curated completely by Facebook) and Reddit is closer to newspapers (there are subreddits, and it can be consumed selectively).


About AWS AppSync

AWS AppSync is a managed instance of GraphQL Java.

It offers an in-browser editor for the GraphQL schema (SDL) and an in-browser editor for resolvers. Resolvers seem to be limited to VTL templates (Apache Velocity Template Language), which do the mapping between GraphQL JSON and data-sources (all data-source types I tried have JSON input/output interfaces: AWS DynamoDB, AWS Lambda).

Neat!

Subscriptions work and I am looking at my Python back-end publishing data to DynamoDB using GraphQL mutations, and the same data appearing in the ReactJS client in real-time.

I see myself adding more access-control logic to the GraphQL endpoint (clients can only view a subset of fields, and can only mutate a subset of fields). I can only add this logic by writing more VTL code in resolvers, which isn't too bad. The problem is: I can't version-control the resolvers! Or the schema!

Brier Score vs Bayesian Information for scoring predictions

In many circumstances, forecasting (or "forming beliefs under uncertainty," in a more general case) is measured with the Brier score. This is the metric of choice for forecasting tournaments such as the Good Judgment Project.

The beauty of Brier score is that it can be used to score an individual analyst or even to score an individual judgment. Perfect for tournaments!
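As a rough illustration of why the score works for a single judgment as well as for a whole analyst, here is a minimal sketch. It assumes the common binary form of the Brier score (mean squared difference between the forecast probability and the 0/1 outcome, where lower is better); the function name and the sample forecasts are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally long, non-empty sequences")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Scoring one individual judgment: 80% on an event that happened.
single = brier_score([0.8], [1])

# Scoring an analyst over several (made-up) questions.
analyst = brier_score([0.8, 0.3, 0.9, 0.5], [1, 0, 1, 0])

print(single, analyst)
```

Nothing in the computation refers to what other forecasters said, which is what makes it convenient for tournaments.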

Real life is different. One of the ten commandments of Superforecasting is: "Triage." In tournament context triaging means focusing on questions which aren't obvious, but still aren't full of irreducible uncertainty. In real life triaging is also about focusing on questions which can impact the outcome.

Scoring a prediction by the amount of Bayesian Information it contributes to the "market" is a perfect scoring mechanism for real life. I got a taste of it during a CFAR/MIRI workshop last September and loved it. The number of bits of added Bayesian Information equals the base-2 logarithm of the new prediction divided by the current market probability assignment. In other words, if the current market probability assignment is 10%, then (given that the market eventually evaluates to True) betting an 80% assignment would yield 3 bits of Bayesian Information: log_2(80/10) = 3. It can be derived using the well-known Value of Information.

The core intuition here is that a probability assignment, while being perfectly accurate and aggressively precise, can still contribute zero bits of Bayesian Information - if the produced assignment is equal to the current "market" level. This is critical in real life. Rephrasing Kahneman: you win not by reacting to new information, but by reacting to new information which hasn't yet propagated through the environment.
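The worked example and the zero-bits case can be checked with a small sketch. The function name is hypothetical, and the handling of a market that resolves False (using the complementary probabilities) is my assumption; the post only describes the True case.

```python
import math


def bits_added(new_probability, market_probability, outcome=True):
    """Bits of information a prediction adds relative to the market level."""
    if not outcome:
        # Assumption: for a False resolution, score the probabilities
        # that each side assigned to "False".
        new_probability = 1.0 - new_probability
        market_probability = 1.0 - market_probability
    return math.log2(new_probability / market_probability)


# The example from the post: market at 10%, prediction at 80%, resolves True.
print(bits_added(0.8, 0.1))

# Agreeing with the market adds nothing, however accurate the number is.
print(bits_added(0.8, 0.8))

# Being confidently wrong costs bits (negative score).
print(bits_added(0.8, 0.1, outcome=False))
```

Unlike the Brier score, this function cannot be evaluated without the second argument: the market level is part of the score.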

In real life (in a game-theoretic environment with the element of conflict) it is not enough to be rational and do computations well. It is also important to "have an edge": be able to form better judgments than others can.

Simply speaking, forming a belief that probability of an interest-rate hike is 80% is useless if financial markets are pricing-in the same probability. In fact, one must not act upon completing such inference.

Estimating Bayesian Information requires the forecaster to be aware of what other participants think. This makes it impossible to score a forecaster in isolation. However, it is a valid requirement in real life.

#bayes #superforecasting

Distributed SGD as an alternative to homomorphic encryption

The purpose of homomorphic encryption is to allow computation on encrypted data. Computation is deterministic and serves some decision-making process.

The purpose of a probabilistic model (deep or shallow) is to suggest the best output given an input. A probabilistic model always serves some decision-making process too.

An important distinction between a computation and a probability distribution is that a probability distribution compresses information, but not all of it - just the information relevant to the target decision-making process - and discards the rest.

This feature of probabilistic models can be used to share the decision-making process without sharing data. This is analogous to a human-consultant: she knows private information of every client and can't transfer it across clients, but she can transfer her expertise acquired from knowing the private information of clients. Probabilistic models are even better than a human consultant: their capacity can be controlled to fine-tune how much information content of the training-data can be recovered from a trained model.

The most straightforward implementation of such system can look like a distributed mini-batch gradient descent: each party uses its private data to form mini-batches, and they only transfer the parameters of the model between each other - in a cross-organizational distributed SGD process. Neither party is incentivized to spoil such distributed training process with fake data because they won't be able to "recover" unspoiled parameters after other parties perform parameters-updates using their batches.

This last property is somewhat analogous to a block-chain concept: it is impossible to "rewrite" a parameters-update step in the middle of the process while keeping the subsequent steps unchanged.
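A toy version of the cross-organizational training loop described above might look like the sketch below. Everything concrete in it is an assumption for illustration: the two parties, the linear model y = w*x + b, the noiseless synthetic data, the round-robin turn order and the learning rate. It also ignores transport and security entirely; the point is only that mini-batches stay with their owner while parameters travel.

```python
import random


def sgd_step(params, batch, lr=0.05):
    """One mini-batch gradient step for y = w*x + b under squared error."""
    w, b = params
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
    return (w - lr * grad_w, b - lr * grad_b)


def make_private_data(rng, n, lo, hi):
    """Hypothetical private dataset; all parties observe the law y = 3x + 1."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 3.0 * x + 1.0) for x in xs]


def train(parties, rounds=2000, batch_size=8, seed=0):
    rng = random.Random(seed)
    params = (0.0, 0.0)
    for step in range(rounds):
        data = parties[step % len(parties)]    # parties take turns
        batch = rng.sample(data, batch_size)   # the batch never leaves its owner
        params = sgd_step(params, batch)       # only the parameters are passed on
    return params


rng = random.Random(42)
party_a = make_private_data(rng, 50, -1.0, 0.0)  # party A only sees x < 0
party_b = make_private_data(rng, 50, 0.0, 1.0)   # party B only sees x > 0
w, b = train([party_a, party_b])
print(round(w, 2), round(b, 2))  # should land close to 3 and 1
```

Each party sees only half of the input range, yet the shared parameters end up fitting the whole range.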

#ml #pgm #blockchain #homomorphicencryption

Opinion: relation between Machine Learning and Probabilistic Programming

Reposting here my comment on Hacker News:

Both Machine Learning (ML) and Probabilistic Programming (PP) work towards building a mathematical object (model), which takes observation as input and produces prediction as output. Both ML and PP are about finding the parameters of the model, which requires bits of information. Main differences:

The type of bits of information used to find parameters of the model is one main difference.

In ML, information used to "train" the model is called "training set" and usually consists of the observation-prediction pairs (or just observations, for unsupervised learning), which are of the same kind as desired observation-prediction pair. In other words, with ML you "fit" your model using the same type of data, which you will use the model for. At least, this is the most common scenario of ML.

In the problems that PP specializes in, this is not as prevalent. For example, with PP we may often see a model which predicts where a planet is, while the model is itself trained using apples falling from a tree.

The type of output is the second difference.

ML focuses on producing a single value of the most likely prediction (Maximum a posteriori estimate), while PP produces a probability distribution of the predicted quantity.

Arguably, the probability distribution of the prediction is more useful for any problem where we need to use the prediction for some decision-making process because then we can calculate the expectation of (mis)decision-cost over possible values of prediction.
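A small sketch of that expectation, with made-up numbers: the demand distribution, the cost function and all names are hypothetical. It shows a case where the single most likely value and the cost-minimizing decision disagree.

```python
def expected_cost(action, distribution, cost):
    """Expected cost of an action under a discrete predictive distribution."""
    return sum(p * cost(action, outcome) for outcome, p in distribution.items())


def best_action(actions, distribution, cost):
    """Action with the lowest expected cost."""
    return min(actions, key=lambda a: expected_cost(a, distribution, cost))


# Hypothetical predictive distribution over tomorrow's demand (in units).
demand = {0: 0.1, 1: 0.6, 2: 0.1, 3: 0.2}


def stock_cost(stocked, demanded):
    # Hypothetical asymmetric cost: running short hurts 5x more than overstocking.
    shortfall = max(demanded - stocked, 0)
    surplus = max(stocked - demanded, 0)
    return 5 * shortfall + 1 * surplus


point_estimate = max(demand, key=demand.get)            # single most likely value
decision = best_action(range(4), demand, stock_cost)    # uses the full distribution

print(point_estimate, decision)  # 1 3
```

With only the point estimate in hand, the asymmetry of the cost function has nothing to act on.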

Practically, many decision-making situations can be transformed into estimating some quantity (instead of taking some arbitrary action) and ML model can be built to directly produce prediction, which is at the same time a decision.

The speed of prediction is the third difference.

ML model, after the "slow" training phase, produces a mathematical object, which is a straight function and can be very quickly calculated for any observation to produce prediction (think: forward-pass in a deep-learning network).

With PP, with most common models, you need to go through a "slow" inference (MCMC-like) process for every new observation to produce a prediction. It is possible, in some cases, to design a probabilistic model such that it only needs to be fitted once and can then be transformed into an analytical expression (function) producing predictions from observations.

As a result of these differences, people applying PP and ML are of different characters :) Still, I expect ML and PP to converge over time. Specifically, I notice that:

ML-models are used to represent probabilistic distributions in probabilistic models; and

trained ML-models are being interpreted in terms of their information content (in probabilistic terms, as if ML-model is probabilistic model).

Using VR for ground-truth annotation

VR provides a strong bi-directional interface between a human and a computer.

While planning to replicate Disney Lab's experiment, I ran into the problem of annotating a video-stream. In particular, I need to label a moving object in a 2d-video.

A human's head and neck are well suited to tracking a moving object in the surrounding environment (better than a hand with a mouse or touchpad). With VR we can have a human track a virtual object and produce labels - by moving his head.

For those who did clay pigeon shooting it's a very familiar exercise.

Tracking eye-movements directly may be even more promising, as suggested by my good friend Zellmer. I will try that later.

In VR we can easily visualize a 4d dataset (3 physical dimensions + time), and with clever enough UI annotate it with ease.

I am excited to be building this!


Expected Utility in Real Life

I responded with the following to Eliezer Yudkowsky this morning in https://www.facebook.com/yudkowsky/posts/10155299391129228

"Expecting utility" is only convincing if probabilities convert into frequencies. When faced with 10/9-odds bet on a fair coin I will be willing to bet $100 but I won't be willing to bet $1,000,000. And not only because of non-linear utility of money, but because I am sure I won't encounter enough million-sized bets of similar structure throughout my lifetime for "expectation" to materialize. But I am sure that I will encounter plenty of $100-sized "better-than-even" bets on a random binary event.

I call it "opportunities abundance". When considering writing a book, I can stay rational and conclude that: odds are high, probability is low, and I have enough years-of-life left to make the expected utility positive.

When considering a startup I can conclude again that I can attempt one every 5 years. That's 5-10 per lifetime. Not many, but not a "single shot", and enough for me to rationally accept "expected utility" calculation for the startup.

And it doesn't have to be 5-10 startups! It can be 2 startups and 5 dramatic career changes, or big stock-investments.

For me, the strategy to solve the posed problem ["most people seem to require catastrophically high levels of faith in what they're doing in order to stick to it" - Eliezer], therefore, is: live longer, accept that opportunities are abundant (even if they look very different, but have similar odds/probabilities-structure and size).
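The "probabilities convert into frequencies" point can be made concrete with a short calculation for the coin bet. This is a sketch under one assumption of mine: "10/9 odds" is read as a win paying 10 for every 9 staked, with equal stakes on every bet. It computes the exact probability of ending with a net profit after a given number of such bets.

```python
from math import comb


def prob_ahead(n_bets):
    """Probability of a net profit after n_bets equal-stake fair-coin bets.

    Assumption: a win pays 10 for every 9 staked; a loss forfeits the stake.
    """
    # Net profit > 0  <=>  wins * 10/9 > losses  <=>  19 * wins > 9 * n_bets
    winning_outcomes = sum(
        comb(n_bets, wins)
        for wins in range(n_bets + 1)
        if 19 * wins > 9 * n_bets
    )
    return winning_outcomes / 2 ** n_bets


for n in (1, 10, 100, 1000):
    print(n, round(prob_ahead(n), 3))
```

A single bet is a coin flip regardless of its positive expectation; only with many bets of similar structure does the probability of finishing ahead approach certainty.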

One, two, three

This is a test for search-engine indexing and web-archiving.
