tag:blogger.com,1999:blog-23532223926406167322024-02-08T17:24:44.749+01:00The Flying MoleUnknownnoreply@blogger.comBlogger16125tag:blogger.com,1999:blog-2353222392640616732.post-66023053719439753552017-07-13T01:34:00.003+02:002022-11-20T23:36:20.278+01:00Building Docker "Toolbox" for 32bit host system: docker-cli 17.07 and docker-compose 1.15<script type="text/markdown">
<![CDATA[The background story is simple: I've got x86 Windows 7 and wanted to run docker. Docker-for-Windows says it needs Windows 10 Pro or better due to its improved Hyper-V support. Eh. Fortunately, they provide the so-called "Docker Toolbox" for systems that do not match the requirements for a normal installation. Docker Toolbox is essentially a VirtualBox setup that hosts a VM inside which Docker can set itself up. Alright! I tried. Guess what! Docker Toolbox needs 64 bits.
I couldn't understand why it's not available on x86 - it sets up a goddamn VM, and VirtualBox works on 32-bit as well! I searched a bit and found out that there were some recent patches and updates to enable running the boot2docker image on 32-bit; I think Stefan Scherer developed a patch allowing that. It was ~1 year ago, so I searched for instructions on how to apply it. I think my starting point was [this article](https://medium.com/@chrispatten/installing-and-running-docker-on-32-bit-windows-d18b95ee1fc3) and some further articles linked there. Their story was simple: everything works, you just need to prepare the initial VM manually, and you get the rest from the chocolatey repository.
Unfortunately, these articles are a bit outdated, and it turns out that today there's much more work to do than was described. Installing docker-machine via choco fortunately did work fine! However, when I wanted to install the then-current stable Docker version 17.06 on my x86 machine - guess what! - [since version 17.06 they stopped providing x86 builds](http://disq.us/p/1ke6ibv), and what's more, they seem to have removed the old x86 builds as well (only x86_64 in the https://download.docker.com/win/static/ subfolders). [Hi Murphy!](https://en.wikipedia.org/wiki/Murphy%27s_law)
But since you are here, you're probably interested in how to get them back. Skip to the end of this text for a link to binaries, but be warned that by the time you read this they may be old, and I probably won't be rebuilding and updating them any time soon.
Setup tools and freebies
========================
Off we go then:
1. Install VirtualBox (https://www.virtualbox.org/wiki/Downloads)
2. Install chocolatey (https://chocolatey.org/install)
3. Install docker-machine via choco (`choco install docker-machine`)
4. Create a default VM through docker-machine (`docker-machine create --driver virtualbox default`)
5. Check if the VM works (`docker-machine status default`)
6. Check if you can get the config (`docker-machine env default`)
7. Store the config for later (`docker-machine env default > env.bat`)
If docker-machine does not work for you, **panic**. Maybe you will need to build it yourself as I did with docker-cli and docker-compose. I didn't have to, so I don't have a solution/instruction for that, sorry.
The rest of the article assumes that docker-machine still supports x86 hosts and that installing docker-machine succeeded.
8. Try installing docker (`choco install docker`)
9. Try installing docker-compose (`choco install docker-compose`)
10. Check if it works (`docker version`)
11. Check if it works (`docker-compose version`)
Most probably, the last two points will fail with a "program is not a valid win32 application" error or similar. This means that choco found only the x86_64 version and installed that for you, and, well, it's unusable. Uninstall them:
12. `choco uninstall docker`
13. `choco uninstall docker-compose`
Btw: If you don't need the latest version of docker, you can try installing an older one. I found out that 17.05 runs fine under Windows 7 32-bit - you can get it via `choco install docker --version 17.05`. However, at the time of writing, you won't get docker-compose that way, since they only provide x64 binaries. If you have found an acceptable and working version of docker, skip the next chapter and fast-forward to building docker-compose.
Building docker-cli for x86
===========================
Let's get the basic docker command-line utilities: the "docker-cli" project. Here we have a small chicken-and-egg problem, since the build process actually requires a working docker installation. However, we already have a docker-machine up and running, and that's exactly what's needed.
But first, we need to get inside it. If you run the VirtualBox GUI, you can simply open the VM's screen and use it directly as a terminal. However, **I don't recommend it** - the clipboard will not work, for example. I prefer to use SSH; any client will work, I like PuTTY. Peek into the `env.bat` from point (7) to find the IP address of the VM, and connect to it through SSH. At the time of writing, the default login:pass is docker:tcuser. Watch out: the VM's screen in VirtualBox logs you in as `root`, while SSH logs you in as `docker`. Some commands may require elevation, but you can always `sudo sh` and be `root`.
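By the way, docker-machine also provides a built-in SSH shortcut that should drop you straight into the VM's shell, without hunting for the IP yourself:

    docker-machine ssh default
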
Once you have access to the terminal, check if it really came with some default tools and also check if the network access works.
14. `docker version`
15. `git --version`
16. `ping www.google.com`
If any of those does not work, you will have to fix it somehow before continuing.
Assuming everything's good, fetch the sources:
17. `cd /root`
18. `git clone https://github.com/docker/cli.git`
19. `cd /root/cli`
and do not build it just yet - you would only waste your time.
At the moment of writing, the default scripts will build you a linux, osx, or windows version, but the windows part is configured for x64, so it needs a few updates.
At some point I found these articles, which gave me a great kick-off in that matter:
- [this article "Docker Daemon for x32 architectures"](http://www.nirmata.com/2016/02/09/docker-daemon-for-x32-architectures/)
- [or that article "Installing the Docker Client CLI on 32-bit Windows"](https://thesocietea.org/2016/04/installing-the-docker-client-cli-on-32-bit-windows/) (it's from 1st April, but no fools there)
- https://gist.github.com/prateekgogia/05f058bafbccc2478fcc/2f6c7b632a7b391618a3ad6cfdce949142efa3c2
I think I got that last link from the first article, and then followed the gist's updates. Anyway, it was supposed to replace the default 'Dockerfile'; however, if I remember well, the build scripts have changed since it was posted, and I'm pretty sure that eventually I didn't use it at all. I don't remember, and I can't retry it right now. I mention it here only because those articles were important, and because it may be needed by someone in case I screwed up this instruction.
When I noticed that I couldn't use the Dockerfile from prateekgogia's gist, I looked around and noticed that docker-cli is written in Go. I figured it shouldn't have a single problem cross-compiling for win32, so I just changed the build scripts to target a different platform:
20. `cd scripts/build`
21. edit the file called `windows`
At this moment you will probably notice that there's no editor on the VM.. unless you want to cat/sed, let's install one. It's not debian/ubuntu/redhat/etc - we're on TinyCoreLinux; package management fortunately exists, but the repo is limited. I found two editors capable of working in a terminal:
- `tce-load -wi nano`
- `tce-load -wi vim`
One is enough, pick what you like the most or dislike the least.
Once you have an editor, edit the file called "windows" and change BOTH:
- `CC=` from `x86_64-w64-mingw32-gcc` to `i686-w64-mingw32-gcc`
- `GOARCH=` from `amd64` to `386`
Remember to save that file.
In case you noticed 'w64' in the mingw package name: it's OK, that version can build both x64 and x86 binaries.
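For orientation, after the edit the relevant lines of `scripts/build/windows` should look roughly like this (a sketch - the surrounding script may differ between docker-cli revisions):

    export GOOS=windows
    export GOARCH=386
    export CC=i686-w64-mingw32-gcc
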
Now, try building the cross-compiled profile:
22. `cd ../..` (so you're back in 'cli' folder)
23. `make -f docker.Makefile cross`
The line (23) is taken from the readme on docker-cli's GitHub page. If something goes very wrong, maybe it has changed since I wrote this text. Just go there and check the current build command; maybe you will even find a better instruction than mine.
Anyways, this line will, again, most probably, fail due to the `make` command not being present. Just install it the same way as the text editor:
- `tce-load -wi make`
and retry line (23).
It will take some time. The build process will download mingw, download and run some containers for the actual build, and so on. After building everything, it may crash while copying the final output .EXE file. I might have mixed this up with how building docker-compose went, so I'm not sure it happens at all, but if it does, just check the 'dist' folder for the .EXE file. If it has x64 or amd64 in the filename, don't believe the name - we changed the target architecture in the build scripts, and the file name may simply be wrong.
24. Copy that file to your win32 host machine to some folder of your choice
25. Rename to `docker.exe`
26. Try running e.g. the typical `docker version` to check if it works.
27. Add location of that file to PATH environment variable
Since you now have both the `docker-machine` and `docker` commands available, you may use them to clean up the containers and images that were used by the build process. That's optional; you can leave them there if you plan to do more builds in the future.
I suppose you may also ignore the root docker container that you used as the entry-point terminal to download the sources and edit the configuration files. That container seems to be automatically purged and cleaned at every (re)start of the docker-machine.
Btw: if `docker` complains about missing settings, use the `env.bat` file to set them. It may be a good idea to refresh that file from time to time - see line (7).
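Applying it should be just a matter of running the stored `SET` lines in your current shell, e.g.:

    call env.bat
    docker version
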
Building docker-compose for x86
===========================
It turns out that docker-compose is written in Python, not in Go, and that at the time of writing this text, the Python tools that build .EXE files lack support for cross-compiling between x64 and x86 (or I may have just failed to find out how - however, many articles claim that this option was recently removed from PyInstaller). If you want an x86 executable, you have to run the build process on an x86 machine. Fortunately, we already have one: the host machine that you want to run docker on.
28. `git clone https://github.com/docker/compose.git`
29. view `script/build/windows.ps1` and follow the procedure stated in the opening comment
30. Install Python 2.7.x if you don't have it yet (https://www.python.org/downloads/)
31. Ensure that `python` and `pip` are both at PATH (if not, add something like "C:\Python27;C:\Python27\Scripts" to the PATH)
32. `pip install virtualenv`
33. `powershell Set-ExecutionPolicy -Scope CurrentUser RemoteSigned`
34. `cd compose` (the root folder of the sources)
35. `powershell -file .\script\build\windows.ps1`
Again, it may take a while to build. It's all Python, and it uses PyInstaller to generate the .EXE file; it is smart enough to notice that your current OS is 32-bit and will build you a 32-bit executable. Again, it may fail at the very end, during file copying. If it does, look for a new .EXE file that should show up in the `dist` folder. Its name is probably `docker-compose-Windows-x86.exe`, and that's the problem, because the build process is hardcoded to copy `docker-compose-Windows-x86_64.exe` (see the last lines of the .ps1 file). Anyways, we don't care - we got our .EXE file.
36. Copy that file to some folder of your choice
37. Rename to `docker-compose.exe`
38. Try running e.g. the typical `docker-compose version` to check if it works.
39. Add location of that file to PATH environment variable
Now, optionally, you can delete the folder with the docker-compose sources, uninstall Python 2.7, and remove Python and Python\Scripts from PATH.
Disclaimer
==========
As usual, there may be some minor gaps and noise in the process described above; I wrote this instruction based on my notes gathered over the last few days. I made many consecutive attempts before achieving the goal of having x86 versions of docker-machine, docker-cli and docker-compose, so I might have forgotten/overlooked/mixed up some facts, like forgetting an extra `tce-load -wi something`. However, I'm sure that the overall process looked like described above, and I got it working. I managed to run hello-world, ubuntu, and ubuntu32 containers, and to manage them, so.. it seems to work.
TL;DR
=====
I built them, so <strike>you can get them here</strike>:
- [<strike>docker 17.07</strike>](https://drive.google.com/open?id=0B1179asykTa5N3FSb3BGc0htMVk)
- [<strike>docker-compose 1.15</strike>](https://drive.google.com/open?id=0B1179asykTa5OUthRURncUtCSE0)
Please be aware that those builds were done directly from the main git repos, so they are DEV/EDGE builds, not STABLE builds. Since the version numbers are not frozen (e.g. `docker-cli` already got some new commits for 17.07, so if you do your own build of 17.07, you'll get a newer 17.07 than my 17.07), I included a timestamp and git SHA at the end of the filenames, so the exact source code version can be identified later. Both tools also seem to identify themselves via the `version` command, including the commit hash.
<i>Update: ~5yrs passed, I removed those builds from file hosting. They were probably criminally outdated already anyways. If you relied on them in any way, sorry, deleted permanently.</i>
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-10099454550471516512016-10-17T22:01:00.001+02:002016-10-17T22:02:16.209+02:00Getting faster transfer speeds when reading measurement results from Rigol DS1054Z<script type="text/markdown">
<![CDATA[After reading countless reviews, issue discussions, comparisons, price tags and all, I finally bought a Rigol DS1054Z oscilloscope. That was some time ago, somewhere in June/July this year. Until then, I used a Hantek DSO-2090 "PC oscilloscope", or rather, a kind of data-logger shield connected over USB, which means you can't actually do a thing unless it's connected to a computer with the proper software running, all the time. No display, no knobs. Just a box with three BNCs for CH1/CH2/EXT. It has its issues, but I learned a ton. Since it came with its own proprietary software (not that bad, actually) and some API with almost no docs, it was obvious that the device could be fully controlled programmatically from the PC. I researched the API on my own and started writing my own software for it. But that's another story; maybe I will find the time to write it all down some day.
Anyways, I got really used to it over time, and I wanted to check whether I could do the same with a "real oscilloscope". That was one of the reasons for selecting the DS1054Z: its price is relatively low and it has quite well-documented programmatic access. When I started to play with the DS1054Z, the first thing that surprised me was .. I couldn't reliably download measurement results from it.
Note: I use Windows 10 and the NI-VISA/IVI drivers. I know it's bloat for my use case. I'm not using LabView, I'm not building an automated lab (maybe later). I have this one oscilloscope, a PSU (the axiomet one I wrote about earlier), and that's all. I took these drivers because .. they seemed to be the default ones suggested by the manufacturer. I would really like to strip that 1GB software package down to the few megabytes of actual device drivers needed. I even lost a few hours trying to rip them out, but failed, and I don't care about disk space enough to spend more time trying. I also didn't want to start researching raw device protocols like I had to with the Hantek.
_**IMPORTANT**: I actually **have not verified the data** read from the device during these tests; in some or all cases I could have received total garbage, or a mixture of all channels, instead of the channel I wanted to read from. The "fun fact" from part two indicates such a possibility. I was focused on communications and simply haven't had time to verify the data yet. When I read in 1-chan or 2-chan modes and then looked at the raw binary data, it looked fine - but that doesn't prove anything. I need to generate some images and compare them to the on-screen data to be sure. Verifying the structure of the received data, and determining which modes are usable and how, is the next thing I'm going to investigate. I will remove this warning afterwards._
Intro: basic setup
==================
My current DS1054Z firmware, **00.04.03.01.05 "SP1", built 2015-05-26 08:38:06**, has a really interesting way of determining how much data you can fetch from it in one go.
The device has a memory of ~24M samples, all of which can be used for a single 1-channel measurement. I started with that and tried to download all 24M samples. According to the "MSO1000Z/DS1000Z Series Programming Guide 2014", that is not possible right away.
First of all, as noted in the Guide for the `:WAVeform:DATA?` command, there are various "areas" the data can be downloaded from - namely, at least two: screen and memory. By default, the device returns samples from the "screen" (`:WAVeform:MODE NORMal`). You can always get those, even while the device is working, but the data returned is .. the data you currently see on the screen. It's fast and easy to fetch, but I wanted to get the whole 24M of measured data, not just a ~1000pts (1K) post-processed fragment of it.
To access the other buffer, called "waveform data from internal memory", you have to switch to `:WAVeform:MODE RAW` first. I'll skip the `MAX` option since it doesn't change much. Also, the device cannot be measuring at the same time; it has to be in the "STOP" state. Basically, if the device was left in the 'auto' or 'waiting for trigger' state, you can assume that the data of the previous measurement has already been (partially or fully) overwritten by new samples, and you can only safely get the screen data.
Another interesting thing is that `:WAVeform:DATA?` won't usually return the whole data. It actually returns only the part of the data within the range set up by the pair of `:WAVeform:STARt xxx` and `:WAVeform:STOP xxx` commands. For example, here's an attempt to read two consecutive chunks of 1000 (1K) samples:
:WAVeform:STARt 1
:WAVeform:STOP 1000
:WAVeform:DATA?
> #9000001000........
:WAV:STAR 1001
:WAV:STOP 2000
:WAV:DATA?
> #9000001000........
...
As you can see, the 'addresses' start at `1` (not zero), and the START-STOP range is inclusive-inclusive. As you may guess, the last address is either 12000000 (12M) or 24000000 (24M), depending on your device's feature state.
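As a side note, the `#9000001000` prefix in the responses is a standard IEEE 488.2 definite-length block header: a `#`, one digit saying how many length digits follow, then the length itself (here: 1000 bytes of payload). A minimal Python sketch for stripping it:

    def parse_block(buf):
        # buf = b'#9000001000' + payload, as returned by :WAVeform:DATA?
        assert buf[0:1] == b'#'
        ndigits = int(buf[1:2])                 # here: 9
        length = int(buf[2:2 + ndigits])        # here: 1000
        return buf[2 + ndigits:2 + ndigits + length]
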
There is a reason I started with this example. You cannot just set START=1 STOP=24000000 and have fun:
:WAV:STAR 1
:WAV:STOP 12000000
:WAV:DATA?
> #9000000000\n
Here, I tried to read too much in one go. The device didn't raise an error. It simply responded: ok, zero bytes for you - and it actually included no measurement data in the response.
There is a limit to the amount of data that can be read in one block. The device communicates via USB/etc. and seems to have a limited communication buffer which simply cannot be exceeded. If you want more than that, you need to read it block-by-block, like in the first example.
Also, the `:WAVeform:FORMat` option is worth noting. As I saw in various articles, many folks out there use the `ASCii` option so they can easily process the data in scripts, spreadsheets, etc. But this means that the device has to print out text instead of raw data, and text eats the buffer much faster. I wanted to download as much data as possible per query, so I use `:WAVeform:FORMat BYTE`. The measurements are 8-bit anyway, and this format saves a lot of comms time.
I also want to use blocks as large as possible. For example, I could read in blocks of 1000 (1K) samples as in that first example, but then downloading the whole 24M memory would need .. 24000 "start-stop-data?" queries, which would mean a ridiculously long (really) waiting time - something like .. 2 hours .. probably. That's not what I'd like to wait after each measurement :) Fortunately I don't need to, since that 1K window was only used as an example.
Sadly, the "MSO1000Z/DS1000Z Series Programming Guide 2014" (that's the one I used at first) doesn't give any hints on the buffer limits. I later found the "MSO1000Z/DS1000Z Series Programming Guide 2015", and there you can find a table (page 2-219):
BYTE -> 250000
WORD -> 125000
ASCii -> 15625
Even from this table you can see that picking `ASCii` mode means lowering the transfer rate to 15.6 kilo-samples per query. Compared to RAW/BYTE mode and its 250K, this means an over 16x slower transfer. For those who don't "feel" the multipliers: instead of 1 minute, you would need to wait 16 minutes.
As I said, I found the 2015 guide much later, and by then I had already discovered this myself - as well as the fact that these numbers are **not accurate**. These numbers are **safe**, meaning you can (probably) use them at any time.
For a single channel, I am able to successfully use ~1180K blocks. That's over 4x the number mentioned in the docs. I have not hacked the firmware or hardware. It's just that the docs didn't want to delve into the really detailed details of how to get that speed.
Comparing things to the value suggested in docs:
Channels: CH1
Mode: RAW
Format: BYTE
Memory: 24.0M 12.0M 7.5M* 6.0M* 2.4M* 1.2M
Block=250K
Queries*: 95 47 29 23 10 5
Time: 56.6s 27.2s 17.1s 13.2s 5.7s 2.7s
Block=1180K
Queries*: 21 11 7 6 3 2**
Time: 16.5s 8.5s 5.3s 4.3s 1.8s 0.9s**
\*) these acquisition memory depths are not available "by the knobs" on the device's front panel; however, you can get them using the "AUTO" memory depth
\*\*) sadly, the best transfer I got was 1180K, and here the source memory is a tiny bit larger than that, so two queries were needed
I'm not including results for memory sizes smaller than 1.2M, simply because they would all fit into one query. I'm not including results for smaller block sizes, because the programming guide gave its suggested value and I see no point in limiting the block size below that. Also, I'm not including comparisons to the ASCII format, since.. I'm not a masochist. I want the data; I don't want to sit and grow old waiting for it.
My test code isn't super-optimized, so you may be able to squeeze out better transfer times. You could probably remove some assertions and some noncritical commands sent to the device during a single "query", but the point is: the larger the block size you can get, the lower your waiting time will be. It may not look like much for the 1.2M depth, but for the higher ones it really makes a difference.
However, to get 1180K blocks in a reliable way, you need to prepare both yourself and the device for that first.
No hardware hacks needed.
What's the problem?
===================
My current DS1054Z firmware, 00.04.03.01.05 "SP1", built 2015-05-26 08:38:06, has a really interesting way of determining how much data you can fetch from it in one go. It consists of two really non-obvious parts.
First of all, it's not a nice round value of 1180K; it's actually a value between **1179584..1179647 samples per block**. Not a random value: you can see that the 'range' is exactly 64 bytes long, that the lower bound is divisible by 64, and that the upper bound+1 is divisible by 64 as well. In hex, that's 0x0011FFC0 and 0x0011FFFF. I suppose it comes from the device's internal memory architecture, grouped in segments of 64 bytes. I really have no idea, though - just guessing.
Now, that latter value, the upper bound, looks very tempting! (The pretty, shiny, round 0x120000 tempts too, but it is not available at all.) However, using `1179647` requires some very precise conditions to be met - so precise that even I, after researching it and finding out the rule, decided to drop it and stick to the lower-bound value, 1179584/0x0011FFC0, which is _almost_ always available. (That 'almost' is the second big part of the mystery; it's covered in the next section of this article.)
When I first investigated this, I knew neither the limits nor the window size. From a few manual attempts I knew what a successful read looks like, and also that if the requested block size is too large, the device responds with the empty #9000000000 response instead of some helpful kind of range error. I wrote a small application to scan various setups and bisect the block size. It would start at some position with blocksize=1 and blocksize=24M, and try intermediate values until blocksize=N is ok and blocksize=N+1 is not ok.
Here's an example of results:
measurement 01, reading at START=1: found best STOP=1179587 => max read block size=1179587
measurement 01, reading at START=1000: found best STOP=1180611 => max read block size=1179612
measurement 01, reading at START=1000000: found best STOP=2179587 => max read block size=1179588
Surprise! I would have expected the transfer capabilities to be the same regardless of position. I left the scanner running overnight and let it scan positions at random. It found that the most probable min/max blocksize values were 1179584 and 1179647. (Initially I was sure that the max was 1179648, but that was an off-by-one error caused by `:WAVeform:STARt` beginning at `1`.)
I also made some finer-grained sequential scans, and they showed that the numbers in that range are **not random at all** - they form a sawtooth shape:
...
reading at pos=1000: bsmax=1179628 |
reading at pos=1001: bsmax=1179627 | you can see the available blocksize falling by one
reading at pos=1002: bsmax=1179626 | for each increase in position; all on 1-by-1 basis
reading at pos=1003: bsmax=1179625 |
... | but then, suddenly the value jumps
reading at pos=1042: bsmax=1179586 (-61) | from 1179584 to 1179647, and then falls again on 1-by-1 basis
reading at pos=1043: bsmax=1179585 (-62) | then the cycle repeats
reading at pos=1044: bsmax=1179584 (-63) /
reading at pos=1045: bsmax=1179647 (-00) \
reading at pos=1046: bsmax=1179646 (-01) | the length of such 'monothonic window' is 64 samples(bytes)
reading at pos=1047: bsmax=1179645 (-02) | starts with highest possible blocklength=1179647
... | ends with lowest possible blocklength=1179584
reading at pos=1107: bsmax=1179585 (-62) |
reading at pos=1108: bsmax=1179584 (-63) /
reading at pos=1109: bsmax=1179647 (-00) \
reading at pos=1110: bsmax=1179646 (-01) | ..window of next 64 samples/bytes, and so on
...
I thought: wow, so the memory is in fact segmented, and a single query just can never cross a segment boundary. Or something like that. Or, more likely, something entirely different, since this description doesn't fit at all: I was reading over a million samples in one go, and the segments ('windows') seem to be 64 bytes long, so each 1M query crosses a ton of them. Well, nevermind ;)
Anyways.. the sawtooth pattern holds across the whole 24M memory.
But that's not all.
I played a little with the DS1054Z, made a few measurements, and tried to fetch the results using the same windows and offsets - and I couldn't get it working. Just to be sure I had noted the values correctly, I ran the scanner again, and here's what I saw:
measurement 02, reading at pos=1: bsmax=1179603
measurement 02, reading at pos=2: bsmax=1179602
...
measurement 02, reading at pos=999: bsmax=1179629
measurement 02, reading at pos=1000: bsmax=1179628
measurement 02, reading at pos=1001: bsmax=1179627
...
measurement 02, reading at pos=999999: bsmax=1179605
measurement 02, reading at pos=1000000: bsmax=1179604
measurement 02, reading at pos=1000001: bsmax=1179603
Please compare this with the first measurement and positions 1/1000/1000000 above. Surprise #2! Although the values are still within the same bounds, they are different. Fortunately, the sawtooth pattern is still preserved. It "just" "starts" in a different place.
Actually, it turns out that the sawtooth pattern gets offset after every single trigger-and-capture of another waveform.
I think that, maybe, while waiting for a trigger condition, the device constantly samples and writes to the sample memory in a ring-buffer scheme, and when the trigger condition is detected, the device simply runs forward and stops so as not to overwrite the trigger position (minus the configured pre-trigger amount to be left visible). That's a pure guess, of course. That's how my old Hantek DSO-2090 worked, but then again, it didn't have any 64-byte windows - it just returned its tiny 10K or 64K of samples.
Whatever is inside the device, the facts are these: after each measurement, a number between 0-63 is selected as the 'monotonic window offset', the 64-byte window at `:WAVeform:STARt 1` is shortened by this value, and then the normal sawtooth pattern follows, with full 64-byte windows, till the end of the memory (where, obviously, the very last window will also be shorter).
Knowing all that, we can sketch the following readout procedure:
- we know the min/max blocksize (constants)
- we can learn the current offset by simply trying to read at START=1 and checking which blocksize is accepted; we can bisect it in about 6 attempts (which may be time-consuming if we often hit good sizes and get 1M responses), or we can start at the max blocksize and try-fail-retry, decreasing the blocksize by 1 each time (at most 64 fast "zero-length" responses; see the sketch below)
- knowing the window offset, we can easily determine all the sawtooth peak points
- keep the first partial block that was read at the time of determining the offset
- read all blocks from second to next-to-last at max blocksize=1179647
- calculate how much data is left, and read the last block
Assuming the typical case where the random offset is not zero (with a flat distribution, that's 98% of cases), we get `floor(MEM/maxblocksize)` full reads followed by `+2` partial reads [one read = set start, set stop, query data]. Plus some queries to learn the current offset value. So, for the full 24M, that's `X+20+2` reads. Nice!
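Here's a minimal Python sketch of that offset-probing step (the count-down variant), assuming a pyvisa instrument handle `scope` that is already in RAW/BYTE mode:

    def probe_offset(scope):
        # try the absolute max first and count down; each failed probe
        # costs only one fast zero-length response (at most 64 of them)
        bs = 1179647
        while bs >= 1179584:
            scope.write(':WAVeform:STARt 1')
            scope.write(':WAVeform:STOP %d' % bs)
            data = scope.query_binary_values(':WAVeform:DATA?',
                                             datatype='B', container=bytearray)
            if data:
                return bs, data   # keep this first partial block
            bs -= 1
        raise RuntimeError('no blocksize in the expected 64-byte range worked')
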
However, as you might notice, the table from the `Intro` section claimed that my code was able to read 24M in 21 queries. How come?
That's simple: I was lazy and I didn't implement it. What I did was take the **lower bound** of the sawtooth pattern, 1179584. This value allows you to completely ignore the sawtooth - just because the lower-bound value is valid throughout the whole memory - and read the data as it goes, in blocks of 1179584 bytes, right from START=1 till the end, where one partial block read will occur. That means `floor(MEM/minblocksize)` reads followed by `+1` partial read, which for 24M gives .. `20+1`. And we drop the 'X', since we don't care about the offset. Well, children, please don't listen now: sometimes laziness pays off!
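For illustration, here is a minimal sketch of that lazy readout in Python, assuming pyvisa; the VISA resource string is a placeholder, and the memory depth must match your actual acquisition settings:

    import pyvisa

    BLOCK = 1179584   # "safe" lower-bound blocksize, valid at any offset
    MEM = 24000000    # acquisition memory depth - adjust to your setup

    rm = pyvisa.ResourceManager()
    scope = rm.open_resource('USB0::0x1AB1::0x04CE::DS1ZA000000000::INSTR')

    scope.write(':STOP')                      # RAW mode needs acquisition stopped
    scope.write(':WAVeform:SOURce CHANnel1')
    scope.write(':WAVeform:MODE RAW')
    scope.write(':WAVeform:FORMat BYTE')

    data = bytearray()
    start = 1                                 # addresses start at 1, not 0!
    while start <= MEM:
        stop = min(start + BLOCK - 1, MEM)    # STARt/STOP are both inclusive
        scope.write(':WAVeform:STARt %d' % start)
        scope.write(':WAVeform:STOP %d' % stop)
        data += scope.query_binary_values(':WAVeform:DATA?',
                                          datatype='B', container=bytearray)
        start = stop + 1
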
Availability of 1179K blocksize buffer
======================================
If you now pick the magic 1179584 number and try it out yourself, you have a high chance of failure. That's because it's the "max value" - it does not mean that your device will handle it right away. We still have the second part of the mystery to see.
Since I used the 2014 Programming Guide (the one with no hints), I had to guess the blocksize. I remember I tried 12M at first - failed; then 5M - failed; then 2M - failed; then 1M - failed; then I succeeded and got data in a stable way at some blocksize around 500K. A few days later I returned to this topic, only to find out that now I could use 1.0M blocks as well. I was surprised, but hey, I got bigger blocks now, great.
However, I remembered seeing "500K" before, so when, some time later, my device started rejecting 1180K and claiming that the highest transferable blocksize was 580K, I felt like "I knew it would return".
Here begins a long story of trial and error, many overnight scans, and trying out different setups, which led me to creating the following table:
------- no channel*--------- Marker [+] shows trigger setup
[+] ::: ::: ::: ::: -> 1179k <- TRIG=CH1
::: ::: ::: ::: [+] -> 1179k <- TRIG=AC
------ one channel --------- If trigger is on active channel, it can be ignored
CH1 ::: ::: ::: ::: -> 1179k <- TRIG=CH1
::: ::: CH3 ::: ::: -> 1179k <- TRIG=CH3
[+] CH2 ::: ::: ::: -> 580k <-- **
[+] ::: CH3 ::: ::: -> 580k <-- **
[+] ::: ::: CH4 ::: -> 580k
------ two channels --------
CH1 CH2 ::: ::: ::: -> 580k
CH1 ::: CH3 ::: ::: -> 580k
CH1 ::: ::: CH4 ::: -> 580k
[+] CH2 CH3 ::: ::: -> 290k <-- **
[+] CH2 ::: CH4 ::: -> .... not tested yet
[+] ::: CH3 CH4 ::: -> 290k <-- **
---- three channels --------
CH1 CH2 CH3 ::: ::: -> 290k
CH1 CH2 ::: CH4 ::: -> .... not tested yet
CH1 ::: CH3 CH4 ::: -> .... not tested yet
[+] CH2 CH3 CH4 ::: -> .... not tested yet
----- four channels --------
CH1 CH2 CH3 CH4 ::: -> 290k
----------------------------
Notes to the table:
- effects of 'REF' and 'MATH' were not checked
- effects of various memory and timebase modes were not checked
- (*) actually, when you turn off all channels, most things behave as if CH1 were active but just not displayed; I mean, even the [SINGLE] button works and actually TRIGgers and refreshes the waveform
- (**) fun fact: the tool I built for testing had the data-reading queries hardcoded to read from CH1. As you can see in the table, it successfully read the data even when the active channel was e.g. CH3.. I wonder what was actually read there? :)
As you can see, most of the time I had the trigger on CH1. That had me confused for a while, and the relation between active channels and max blocksize was unclear, until I remembered the trigger and included it in the table.
To sum up the table: the rule for determining the blocksize is quite simple. Count the active channels, including the trigger; then, if the result is:
1 channel -> max blocksize = 1179K / 1 = 1179K
2 channels -> max blocksize = 1179K / 2 = 580K
3+channels -> max blocksize = 1179K / 4 = 290K
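In code form, the rule I ended up with looks roughly like this (a hypothetical helper; the 580K/290K values are my measured limits, not documented constants):

    def max_blocksize(active_channels):
        # active_channels = enabled channels, plus the trigger channel
        # if it sits on a disabled channel (TRIG=AC counts as zero)
        if active_channels <= 1:
            return 1179584
        if active_channels == 2:
            return 580000    # measured, approximate
        return 290000        # measured, approximate
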
If you compare the 2014 Programming Guide to the 2015 version, you can find that the latter, in the comments for `:TIMebase:DELay:SCALe`, contains a set of rules for calculating a so-called **amplification factor**. The rules for its **channel sum** seem to work exactly like what's seen here, with the small note that TRIG=AC counts as ZERO (IIRC that part is not mentioned in the channel-sum rules).
But if you think you can just switch some channels on or off to get a higher blocksize, you're wrong!
During the runtime of the device, the blocksize limit seems constant. I tried various things, including disconnecting USB, resetting via the `*RST` command, and resetting via the [CLEAR] or [AUTO] buttons - nothing seems to change the blocksize limit once it is set.
It seems that the blocksize limit is determined .. _**at BOOT TIME**_
You can only guess how long it took me to figure that out. Sadly, I haven't written it down.
As you know, by default the device remembers its last-used settings. I think you can change that behavior somewhere in the Utility menu and set it to revert to some preset config instead. Anyways, what counts here is how many channels are active at boot time. If you turned off your DS1054Z with 3 channels active (or 2 channels and the trigger on a third), then your **next session** tomorrow will have max blocksize = 290K. Whoo.
An interesting thing is that this works both ways. Once you turn off the device in a zero- or single-channel state, it then boots in 1179K mode and seems to keep that mode until shutdown - even turning on all four channels afterwards doesn't degrade the max transfer size. No need to adjust any other options; just turn off all channels before turning off the device.
TL;DR
=====
0) all the detailed information contained here applies to firmware **00.04.03.01.05 "SP1", built 2015-05-26 08:38:06**; I have not tried other versions yet
1) use RAW/BYTE mode to save bandwidth; BYTE is not very convenient, but it is not that hard to calculate the actual values from it
2) magic numbers for transfer size limits:
- absolute max data payload size: 1179647 samples (bytes), but please DON'T USE IT; the explanation is at the end of the `What's the problem?` part
- easier-to-use max data payload size: 1179584; it's almost as high as the absolute max, and with it you can ignore many irritating things
3) if your acquisition memory depth is larger than the blocksize, you have to make several batches of 'start-stop-data?' commands
4) when reading, memory addresses start at 1 (ONE). Not zero. Watch out for off-by-ones. The START and STOP commands set the data range, and both values are **inclusive**: when reading at START=XXX, you have to set STOP=XXX+blocksize-1. Watch out for off-by-ones again. Really. I lost several hours tracking +/-1 errors.
5) if your device rejects a 1180K blocksize, **turn off all channels, then power off the device**. After you turn it back on, it should be good to go at 1180K.
_**IMPORTANT**: I actually **have not verified the data** read from the device during these tests; in some or all cases I could have received total garbage, or a mixture of all channels, instead of the channel I wanted to read from. The "fun fact" from part two indicates such a possibility. I was focused on communications and simply haven't had time to verify the data yet. When I read in 1-chan or 2-chan modes and then looked at the raw binary data, it looked fine - but that doesn't prove anything. I need to generate some images and compare them to the on-screen data to be sure. Verifying the structure of the received data, and determining which modes are usable and how, is the next thing I'm going to investigate. I will remove this warning afterwards._
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-69835898421978058412015-08-31T14:35:00.000+02:002015-08-31T15:18:14.239+02:00meteor + sanjo:jasmine + urigo:angular + mocks: `ReferenceError: module not defined` in Client-Integration tests<script type="text/markdown">
<![CDATA[
Until now, I had been trying to get the client-unit-test mode running and refreshing properly. So far, it all seems to run just fine. The next point on my list was adding `urigo:angular` and `angular:angular-mocks` (1.4.2) to see how they work with Meteor and testing.
Setting up a very simple single-file Angular app went fine. I added some trivial client-unit tests and everything worked just great. Angular-mocks did its job well, too. The app was refreshing, the tests were updated properly. Optimistically, I clicked 'add sample integration tests'. Finished, ran, green, all great.
However, the sample tests are self-contained. All the code is dropped somewhere into the `/tests` directory, nothing really touches the actual application, and of course the auto-generated sample tests have nothing to do with angular. So, I copied/edited some tests from the client-unit mode and ... boom:
ReferenceError: module is not defined
at:
beforeEach(module('mytestapp'));
[It's possible that I'm not the only one seeing such issues](https://github.com/Urigo/angular-meteor/issues/516). The problem seemed quite obvious: in client-integration mode, AngularJS (or rather, the mocks) was not loaded, or at least not available. Quite strange, since everything worked fine in client-unit mode. However, I checked configs, packages, and even the emitted source of the test runner, and everything indicated that angular-mocks was provided. There had to be some other cause.
AngularMocks expect Jasmine to be already present
-------------------------------------------------
AngularMocks 1.4.2 [at line 2190](https://github.com/angular/angular.js/blob/9080d2c53cae9aea9d5a061a5de5b052cea3dcc8/src/ngMock/angular-mocks.js#L2190) expects `window.jasmine` or `window.mocha` to be already defined. Typically, it is set by jasmine-core, but in Meteor it is the `sanjo:jasmine` package that ensures `window.jasmine` is properly set. Both the expectation and the provision happen at package-load time.
For all Meteor packages, the load order is determined by the inter-package dependencies. Since `angular:angular-mocks` can be used with either Mocha or Jasmine (or probably many other frameworks, too), it can't/shouldn't strictly depend on either of them. (It seems nobody wanted to have angular-mocks-jasmine and angular-mocks-mocha, which would probably be the most reasonable way, but, anyways..)
<strike>Solution #1: add a proper optional dependency to the angular-mocks package</strike> - failed
---------------------------------------------------------------------------------------------
Fortunately, Meteor provides the concept of a **weak dependency**, e.g.:
api.use('sanjo:jasmine', ['client'], {weak: true});
which ensures that such a package is loaded before the current one - but it is not pulled in until something else requires it strongly (weak:false or no specifier at all). That's just perfect here, so I wondered: why wasn't it working?
I was surprised to see that, despite adding `angular:angular-mocks` via `meteor add`, the only actual Meteor packages here are:
- urigo:angular
- sanjo:jasmine
It turns out that you will not find an `angular:angular-mocks` Meteor package on GitHub, nor in the Meteor core set (well, obviously). It simply is not a Meteor package. I checked the `packages.meteor.com` package directory, and it turned out to be ... [a Bower package](https://github.com/angular/bower-angular-mocks/blob/master/package.json).
..and, as you may see in its [bower.json](https://github.com/angular/bower-angular-mocks/blob/master/bower.json), the only dependency is Angular, and there are no weak dependencies. I searched a little, and it turns out that **Bower has no such concept** as a "weak/optional dependency".
Too bad, I thought, but maybe there is some option for a Meteor package to indicate a "reverse weak dependency" - that is, to indicate not which packages must be loaded before mine, but rather, which packages my package has to be loaded before. Having such an option set on `sanjo:jasmine` could mark that package as to-be-loaded-before `angular:angular-mocks`.
Unfortunately, this turned out not to be possible either, [because Meteor does not support it now, and it doesn't seem planned either](https://github.com/meteor/meteor/issues/1910). Moreover, such a feature was requested earlier and the request seems to have died. Maybe it's worth resurrecting that topic..
Solution #2: wrap angular-mocks into meteor package - ok!
---------------------------------------------------------
Since Bower does not support optional dependencies and Meteor does, this is a pretty obvious attempt. I tried to provide my own 'angular-mocks' with the weak dependencies specified:
Package.onUse(function(api) {
api.versionsFrom('1.1.0.3');
api.addFiles('client/angular-mocks.js', 'client');
api.use('sanjo:jasmine', ['client'], {weak: true});
api.use('urigo:angular', ['client'], {weak: true});
});
<!-- code break -->
meteor remove angular:angular-mocks
meteor add quetzalcoatl:sanjojasmine-angularmocks
It works! However, in the long run it seems pointless. I don't want to create and maintain and update such a clone just to correctly specify the deps. In case you have this problem and the other solutions don't work for you, you can at least solve it temporarily like that.
Solution #3: add dependencies in proper order - ok!
---------------------------------------------------
Just for fun, I checked whether the order of adding packages to a meteor app makes any difference:
meteor add urigo:angular my-mocks sanjo:jasmine # works
meteor add my-mocks urigo:angular sanjo:jasmine # works
meteor add urigo:angular sanjo:jasmine my-mocks # works
<!-- code break -->
meteor add urigo:angular angular:angular-mocks sanjo:jasmine # broken
meteor add urigo:angular sanjo:jasmine angular:angular-mocks # works
meteor add sanjo:jasmine urigo:angular angular:angular-mocks # works
What. Well, ok, that makes some sense. It seems that for unrelated packages (I mean, with no dependencies between each other), Meteor keeps them in the order of addition. `angular-mocks` requires `jasmine` to be loaded earlier, and simply adding `sanjo:jasmine` first seems to do the trick.
And it **does not matter** whether you do that in **one line or three**:
# will break
meteor add urigo:angular
meteor add angular:angular-mocks
meteor add sanjo:jasmine
<!-- code break -->
# will work
meteor add sanjo:jasmine
meteor add urigo:angular
meteor add angular:angular-mocks
I have to admit that I actually hit this problem because I first created the app, added angular, added mocks, and only after running it did I notice I had forgotten to add sanjo:jasmine.
Therefore, if you have problems with `ReferenceError: module not defined` when using `angular:angular-mocks` and `sanjo:jasmine`, you can simply try:
meteor remove angular:angular-mocks
meteor add angular:angular-mocks
Since you probably already have `sanjo:jasmine`, it's enough to make sure `angular-mocks` is added after it.
_However, since there still is **no relation** between those two requested packages, Meteor can in fact load them in any order; this trick doesn't guarantee anything, and it can change with different Meteor versions. It just happens to work today._
Solution #4: create a dummy package that will pull mocks in order - ok!
-----------------------------------------------------------------------
If in the absence of dependencies Meteor doesn't reorder the list of loaded packages, then maybe:
Package.onUse(function(api) {
api.versionsFrom('1.1.0.3');
api.use('sanjo:jasmine', ['client'], {weak: true});
api.use('urigo:angular', ['client'], {weak: true});
api.use('angular:angular-mocks', ['client']);
});
Note that I removed all the JS files and just added a dependency on angular-mocks. It turns out to work regardless of the order of adding packages:
meteor add urigo:angular my-mocks sanjo:jasmine # works
meteor add my-mocks urigo:angular sanjo:jasmine # works
meteor add urigo:angular sanjo:jasmine my-mocks # works
As the version numbers are not pinned on these dependencies, Meteor will select the most current ones from the directory, saving me from maintaining a clone of `angular:angular-mocks`.
_Just as with solution #3, however: since there still is **no relation** between those two requested packages, Meteor can in fact load them in any order; this trick doesn't guarantee anything, and it can change with different Meteor versions. It just happens to work today._
One important note on solution #4: since the mocks are now implicitly provided by that dummy package, you have to remove `angular:angular-mocks`, or else your app will fetch it on its own and nothing will change.
Afterthoughts
-------------
Of course, it's disputable which solution is the best.. The best would be #1, but currently it is not possible. Oh, sorry - the very best would be for angular-mocks not to expect jasmine to exist before it, or to have an angular-mocks-jasmine that depends on jasmine, but that's out of scope for now.
When you hit the exact problem described here, then picking between solution #3 and #4 is effectively a choice between (#3):
meteor remove angular:angular-mocks
meteor add angular:angular-mocks
and (#4):
meteor remove angular:angular-mocks
meteor add my-dummy-mocks
and neither of them actually guarantees you much. So, **pick #3** - the principle of least surprise.
Similarly, when starting a new project, you have a choice of:
(#3) remember to add the dependencies in correct order
(#4) remember to add my-dummy-mocks instead of angular-mocks
and neither of them actually guarantees you much - but the first has a much lower risk of suddenly starting to fail (preserving package-addition order vs. preserving the order of dependencies). So, again, **pick #3**.
In case someone asks: this is why I am not publishing packages for #1 and #4. I would also currently advise against doing it, but of course there's no way of stopping anyone. If you consider doing it, then please focus on lobbying for reverse weak dependencies (so sanjo:jasmine can fix that), or on lobbying for a specialized angular-mocks-jasmine (anyway, why are just jasmine and mocha so special?)
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-36166294252472873522015-08-28T13:40:00.000+02:002015-08-31T15:17:46.401+02:00Windows + meteor + sanjo:jasmine + karma: hot code updates prevent automatic test rerunning<script type="text/markdown">
<![CDATA[
It turns out that last time I was a bit wrong! Upon a hot code update, not only PhantomJS has problems. The whole of Karma has, and it has them regardless of the selected browser.
Velocity/Sanjo/Karma/(PhantomJS|Chrome|.+) hangs when Meteor auto-rebuilds your project
---------------------------------------------------------------------------------------
I must have been tired back then, because I didn't notice that autoupdate breaks running tests in Chrome, too. [Please see this issue for an analysis of that](https://github.com/Sanjo/meteor-jasmine/issues/271). In short, I've learned that this is in fact a Karma problem, or rather a Chokidar-on-Windows problem: when an observed directory is removed or renamed, Chokidar stops observing it - and that's exactly what happens to the whole build directory. Fortunately, after some work and a few hints from Sanjo about (re)starting Karma, I managed to work it around by:
- detecting when autoupdate is about to happen
- deterministically killing Karma in that case
- detecting when autoupdate finishes
- restarting Karma right after that
[Here's the patch](https://github.com/Sanjo/meteor-jasmine/pull/272). Yay, now it really, actually works with hot updates.
All the details are described in the issue, but it's worth pointing out a few things:
- since Karma dies just before the autoupdate begins, all of Chokidar's file locks are freed - this means no EPERMs during the rebuild, and meteor can usually just `rename` the dirs with no cp-r/rm-r fallbacks
- since Karma dies and rises again, there's little point in watching any application files at all - a change to client files will cause an autoupdate (and restart Karma), a change to server files will kill & restart the whole app (and Karma too). The patch mentioned above turns these watches off, but leaves the watches on test/spec files.
- Meteor raises two interesting 'events': 'message{refresh:client}' and 'onListening'
Message: refresh-client
-----------------------
Reminder: hot code updates are either 'refreshable' or 'non-refreshable'. If you change only 'client' code, it's the former: clients will refresh and the server won't notice. When you change any 'server' code (/server, /lib, ...), it's the latter, and the whole app must be refreshed on both sides. By 'refreshed' I mean kill-everything-and-restart.
`refresh:client` is broadcast by the application when it is about to start an autoupdate cycle in 'refreshable' mode. Actually, there's an internal listener on that event which starts the autoupdate process. Listening to that event will run your code **before** the autoupdate starts, but only because that internal listener uses `setTimeout(.., 100ms)` to delay the process just a bit - so any custom listeners added must act .. quickly, or assume they are running "in parallel" with the build.
The 'non-refreshable' mode does not raise that event, as the app is going to terminate and fully restart in a moment, which will cause all clients to refresh just like during any server reboot.
WebApp.onListening event
------------------------
Basically, whatever listener is registered there is invoked when the hosted application is ready to serve requests. This guarantees that any build processes have finished. It is invoked in all three cases:
- application's initial start
- after refreshable update finishes rebuilding
- not-refreshable update (because it's in fact 'initial start' anyways)
However, using this event needs some care. When raised at the application's initial start, it is called **before Velocity** starts its startup tasks. During a **refreshable autoupdate**, it seems to be called last. In this case, the handler wanted to start Karma after a refreshable update - but the same handler is also called during the app's start, when, a few seconds later, Velocity would start Karma too. That means handlers may need to check at what stage they were invoked, just to make sure they don't race with Velocity at the application's start.
Another (small) catch with onListening is that it does not always wait for serving to start. If the app is already running, registering a handler on that event **will not wait for the next update/restart**; instead, it will immediately invoke that handler. Only handlers registered **before the app is ready** are delayed, stored, and executed when it's ready.
Another (big) catch comes along with the previous one: I really did say `delayed, stored`. Handlers registered when the app is already running **are not stored** - they are just run immediately, and they will not run upon the next update/restart. Actually, the handlers registered early will not fire again either: it turns out that the list of delayed handlers is cleared as the handlers are invoked.
This causes an interesting problem: it's tricky to listen to `onListening` permanently. You have to register the handler at the exact moment when the application is not considered to be running, or else the handler will fire immediately. Fortunately, I noticed that one such point in time is the `message-refresh-client` notification. It is sent while preparing for a quick rebuild, so any `onListening` handler registered during `message-refresh-client` will be delayed and called when the refresh cycle ends.
Any examples?
-------------
Just in case you missed the link, [here's the patch](https://github.com/Sanjo/meteor-jasmine/pull/272) that fixes re-running tests when a hot code update happens. It uses these two events to do the job of restarting Karma. Oh, and [some small tweaks to sanjo-karma](https://github.com/Sanjo/meteor-karma/pull/9) were also needed to be able to actually stop Karma.
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-65242234988735015722015-08-19T15:34:00.001+02:002015-08-31T11:44:18.428+02:00Windows + meteor + sanjo:jasmine: fighting with EPERMs and quest for HiddenChrome<script type="text/markdown">
<![CDATA[
I recently tried using [Meteor](https://www.meteor.com/) to build and test a simple application. Since I got used to Jasmine, I tried the mainstream plugin [sanjo:jasmine](https://atmospherejs.com/sanjo/jasmine) for Velocity bundled with Meteor.
When having some unit-tests, autoupdates cause "Error EPERM, stat" and/or "Error EPERM, unlink"
-----------------------------------------------------------------------------------------------
I have to say, I'm not delighted with the testing experience. At first glance it looks great. However, during the first few days I stumbled upon several bugs related to deleting directory trees during autobuilding/autorefreshing - very annoying; it literally killed (I mean, **crashed**) the server on each code update and forced me to rerun it manually, which invalidates the whole live/hotswapping idea. Ah yes, I forgot to say I'm on Windows. It probably works perfectly on Linux, where the filesystem behaves differently. Poor me.
I was able to trace and somewhat patch them up, but it doesn't 'feel stable'. I have observed that during autoupdate, even with those patches, `renameDirAlmostAtomically` still falls back to the cp-r/rm-r fallback, which probably shouldn't happen. Anyways, at least meteor stopped crashing due to EPERMs :)
You can find [the rimraf patch here](https://github.com/isaacs/rimraf/issues/87) and [the meteor patch here](https://github.com/meteor/meteor/pull/4933)
Velocity/Sanjo/Karma by default opens a Chrome window to run the tests
----------------------------------------------------------------------
The next thing that bothered me was the Chrome window that was repeatedly opened and closed by Karma on each autoupdate. I understand this is necessary, as the whole app is being killed, Karma included, and Karma needs an actual runner - but damn! When you work on a **single-monitor** workstation, you can't even keep Chrome minimized, because it will soon be killed and restarted and a new window will pop up, cluttering your workspace again.
Tired of it, I searched the web for a solution and found that it's relatively easy to switch sanjo:jasmine to use PhantomJS instead of Chrome. Great - I knew PhantomJS from working with [Chutzpah](https://github.com/mmanela/chutzpah). However, despite everybody talking about how great it is to use hidden PhantomJS instead of windowed Chrome, the hints on how to do that varied. The most common were:
- set environment var `JASMINE_BROWSER=PhantomJS meteor`
- set environment var `JASMINE_BROWSER=PhantomJS meteor run`
- set environment var `JASMINE_BROWSER=PhantomJS meteor --test`
- set environment var `JASMINE_BROWSER=Chrome PhantomJS meteor run` (sic! what a pity I didn't save the link where I saw that)
I wouldn't be writing about it if any of those worked. None did. After looking at the sources I noticed that this value is almost directly passed to Karma, and that:
- this option takes a single keyword like 'Chrome' or 'PhantomJS' only; no "commands" like `meteor --test` are supported
- this option **does not** support specifying multiple browsers, even though Karma could
So, at the time of writing, sanjo:jasmine ver 0.17.0, the only proper way of setting this variable is:
JASMINE_BROWSER=Chrome
JASMINE_BROWSER=PhantomJS
otherwise it won't run anything, as sanjo:jasmine will not be able to pick a 'launcher' (karma-chrome-launcher, karma-phantomjs-launcher, etc) and Karma will start with no runners (so e.g. you will still be able to manually open localhost:9876 in some Chrome window).
So far so good. I stopped meteor from crashing, I switched it to PhantomJS, got rid of the Chrome window! Did I mention that instead of a Chrome window I now got an empty console window?
Velocity/Sanjo/Karma/PhantomJS by default opens a console window to run the tests
---------------------------------------------------------------------------------
I started to feel a little grumpy about that, but I kept digging and found that somewhere in sanjo's `long-running-child-process` there's a series of `child_process.spawn`s that run a spawnscript with stdio bound to a logfile, and that script in turn creates the child PhantomJS process. That spawnscript had the option `detached: true` set. That option comes from NodeJS and is probably important on Linux to 'daemonize' a child, but on Windows parent processes do not wait for their children anyway, so there's no point in it. I removed that option and wow, the console window disappeared. Now I had the test process truly hidden.
You can find [the patch here](https://github.com/Sanjo/meteor-long-running-child-process/pull/8)
Velocity/Sanjo/Karma/PhantomJS hangs when Meteor auto-rebuilds your project
---------------------------------------------------------------------------
How great was my surprise and disbelief when I noticed that the Velocity HTML reporter stopped refreshing the test results ..but only sometimes:
1a. (re)started my meteor app, page got autoupdated, all tests were run
1b. edited specs, page was not updated (good!), testrunner noticed it, tests ran, results displayed (cool!)
1c. edited app's code, page was autoupdated (good!) ..but tests were **not** re-run, old results displayed (wha?)
2a. (re)started my meteor app, page got autoupdated, all tests were run (ok, so nothing broken)
2b. edited app's code, page was updated (good!) ..but tests **not** run, old results displayed (see the pattern)
3a. (re)started my meteor app, page got autoupdated, all tests were run, everything fine
3b. edited specs - everything ok, tests ran
3c. edited specs - everything ok, tests ran
3d. edited specs - everything ok, tests ran
3e. edited app's code - app updated, tests **not** run (yup)
3f. edited app's code - app updated, tests **not** run
3g. edited specs - everything ok, tests **not** run (what?!)
3h. edited specs - everything ok, tests **not** run (...)
4a. (re)started my meteor app, page got autoupdated, all tests were run, everything fine
Considerable time spent tracing the problem later proved that it happens **only when using PhantomJS** and **only when the change to the application code causes a restart**. I'm pretty sure it's not PhantomJS's fault. In .meteor/local/log/jasmine-client-unit.log I found some errors regarding `EPERMs` and startup crashes mentioning that `Package` is not defined. It seems that PhantomJS was unable to reload some of the files; maybe they were not properly deleted or replaced. I wasn't able to trace it yet.
As I said, I noticed this problem occurs <strike>only</strike><sup>\*)</sup> when PhantomJS is used. When Chrome is used, with its irritating window popping up, <strike>everything works fine</strike><sup>\*)</sup>. I switched back to using Chrome and focused on hiding the window. Unfortunately, my time was getting shorter and shorter. I tried to find the place where Karma actually starts the child process, so I could use some old tricks to send a hide-window message to it, but first I found the `karma-chrome-launcher` module that finds the Chrome executable and builds the command line options. Hey, maybe Chrome has some cool command line switch to hide the window?
<sup>\*)</sup> _it turned out to be completely not true._ [see next post on that](http://quetzalcoatl-pl.blogspot.com/2015/08/karmavelocitymeteor-hot-code-updates.html)
Hiding the Chrome window
---------------------------------------------------------------------------
After some searching, I learned that most opinions say it's not possible and that I'd have to use some third-party tool to send a message to its window to hide it (yeah, echo!). I ignored that and searched for the list of Chrome's command line options. [Here it is in raw form](https://src.chromium.org/svn/trunk/src/chrome/common/chrome_switches.cc) and [here it is in a nice table](http://peter.sh/experiments/chromium-command-line-switches/).
As you can see there, there are some promising flags!
// Does not automatically open a browser window on startup (used when
// launching Chrome for the purpose of hosting background apps).
const char kNoStartupWindow[] = "no-startup-window";
// Causes Chrome to launch without opening any windows by default. Useful if
// one wishes to use Chrome as an ash server.
const char kSilentLaunch[] = "silent-launch";
I managed to successfully run Chrome with `--no-startup-window` and indeed it launched without any windows. It looked like it launched properly, it spawned all the typical children, but the website I tried to make it load didn't seem to be actually visited. It may be that this headless mode is only for hosting background apps and not for visiting sites, but it looks very promising, as the normal worker tree is set up, just with no windows. Yet my time was getting even shorter (probably the counter had already rolled into negative values), so I have not investigated further.
The second option, `--silent-launch`, made the Chrome process **very** silent. I didn't notice any children spawned at all and the process exited promptly. I doubt it'll be usable for this case.
After my failed attempts with these options, I turned to less sophisticated ways. At the bottom of the list there are two options:
// Specify the initial window position: --window-position=x,y
const char kWindowPosition[] = "window-position";
// Specify the initial window size: --window-size=w,h
const char kWindowSize[] = "window-size";
I edited **karma-chrome-launcher\index.js** to include options to move it completely out of the working area:
return [
'--user-data-dir=' + this._tempDir,
'--no-default-browser-check',
'--no-first-run',
'--disable-default-apps',
'--disable-popup-blocking',
'--disable-translate',
'--window-position=-800,0', // <-- added
'--window-size=800,600' // <-- added
].concat(flags, [url])
and sure, it's not true headless, but still the window is *finally out of my sight* and tests are properly re-run whenever I edit either the app or the specs. Yay! I'll probably [push a change proposal](https://github.com/Sanjo/meteor-jasmine/pull/269) so those options can be turned on/off on demand.
Solution revisited
------------------
After searching a little more, I found out that karma already has a way of passing custom parameters to its browsers, and thanks to that the change was narrowed down and moved from karma-chrome-launcher to sanjo:jasmine, which builds the karma configuration file.
You can find [the patch here](https://github.com/Sanjo/meteor-jasmine/pull/269) and if using it, don't forget to remove the `JASMINE_BROWSER` env variable, or set it to `HiddenChrome`:
JASMINE_BROWSER=HiddenChrome
Happy coding with new "HiddenChrome" browser launcher.
Final notes
-----------
It feels dirty though. I'd prefer really hiding that window and/or using PhantomJS.
..and that was only the client-unit tests; client-integration, server-integration and server-unit are yet to be tried.. I mean, if the server-unit module gets fixed - it's officially said to be unstable now, and it's advised to use the server-integration mode to run those tests instead.
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-1436030121548102682015-01-09T16:48:00.000+01:002015-08-19T15:47:42.383+02:00Quick reminder about AttachedProperties and AttachedBehaviors<script type="text/markdown">
<![CDATA[
Yet again, I produced much more text than I intended for a short response on StackOverflow.. If somebody is interested, the question came from [StackOverflow: Dynamic Conditional Formatting in WPF](http://stackoverflow.com/questions/27863334/dynamic-conditional-formatting-in-wpf).
---
Let's start with WPF support for "Attached properties". You can literally attach them to anything you like. `Grid.Row` is an example of such an attached property.
Once you create a couple of your own attached properties, you can set and observe them, and also bind to them, as usual:
<Foo x:Name="foo" myns:MyExtension.MyAttachedProperty1="SomeSource.AProperty" />
<Bar myns:MyExtension.MyAttachedProperty2="{Binding Path=(myns:MyExtension.MyAttachedProperty1).Width, ElementName=foo }" />
note that when binding **to** an attached property, you must use parentheses in the property path, or else the `MyExtension` class name will be treated as a source instead of as a classname prefix. You may also need namespace prefixes (`xmlns:my=...` + `my:` everywhere)
Once you learn/master attached properties, things start to be fun! Since you can attach them and bind on almost anything, you can introduce smart extensions like:
<TextBlock m:MyEx.UnitSystem="{Binding ..}" m:MyEx.SourceValue="{Binding ..}" />
Note that `TextBlock.Text` is not bound. The idea behind it is that the `SourceValue` attached property gets the raw value to be displayed, and change handlers observe the changes to both SourceValue and UnitSystem, translate the value, and set the Text on the component. Not very surprising.
But the surprising fact that is easily overlooked is that your code will be totally decoupled. You will have a source-of-value and a source-of-unitsystem, both just pulled from the datasource. Your calculations will just emit the value. And yet another class/file will define the plugin that handles the conversions and updates the TextBlock.
But, ok, so we have the source-value bindings attached like above. Where to put the actual code that handles the changes? Of course, you can put it right into these attached dependency properties: just register them with metadata that includes a change handler and you're done. Since every change handler receives the originating DependencyObject (the TextBlock), those change handlers will be able to update the Text.
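Here's roughly what that registration could look like (the standard WPF `DependencyProperty.RegisterAttached` pattern; the `MyEx`/`SourceValue` names and the trivial handler body are mine, not from any real library):

    using System.Windows;
    using System.Windows.Controls;

    public static class MyEx
    {
        // An attached property carrying the raw value; the metadata wires in a change handler.
        public static readonly DependencyProperty SourceValueProperty =
            DependencyProperty.RegisterAttached(
                "SourceValue", typeof(double), typeof(MyEx),
                new PropertyMetadata(0.0, OnSourceValueChanged));

        public static double GetSourceValue(DependencyObject obj)
        {
            return (double)obj.GetValue(SourceValueProperty);
        }

        public static void SetSourceValue(DependencyObject obj, double value)
        {
            obj.SetValue(SourceValueProperty, value);
        }

        private static void OnSourceValueChanged(DependencyObject d, DependencyPropertyChangedEventArgs e)
        {
            // 'd' is whatever element the property was attached to - e.g. the TextBlock above.
            var tb = d as TextBlock;
            if (tb != null)
                tb.Text = ((double)e.NewValue).ToString("F2"); // stand-in for the real unit conversion
        }
    }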
But it'll get really messy once you need to observe two, three or more source properties, because every one of those will need to be really carefully tracked.
So, here's what I like to do in such cases:
<TextBlock my:MyEx.UnitSystem="{Binding ..}"
my:MyEx.SourceValue="{Binding ..}"
Text="{Binding (my:MyEx.DisplayedValue), RelativeSource={RelativeSource Self}}">
<my:AutoUnitConversion />
</TextBlock>
The AutoUnitConversion is an AttachedBehavior that, after attaching to the TextBlock, observes changes to UnitSystem and SourceValue and calculates the displayed value. The behavior could directly set the Text of the TextBlock, but then it would be usable only with TextBlocks. So, instead, I'd make it emit the calculated value as a third property, DisplayedValue. Note the usage of RelativeSource=Self, since the output attached property is set right on the parent component. In this setup, you may then directly reuse the AttachedBehavior in other places - just the final Text/Value/etc binding will change.
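For the curious, a rough sketch of what such a behavior could look like using the Blend SDK's `Behavior<T>` base class (`System.Windows.Interactivity`); it assumes `MyEx` from above also defines `UnitSystem` (string) and `DisplayedValue` (double) attached properties, registered the same way as `SourceValue`, and the conversion itself is just a placeholder:

    using System;
    using System.ComponentModel;
    using System.Windows;
    using System.Windows.Interactivity; // Blend SDK

    public class AutoUnitConversion : Behavior<FrameworkElement>
    {
        protected override void OnAttached()
        {
            base.OnAttached();
            // Watch both inputs on the host element; recompute the output on any change.
            Descriptor(MyEx.SourceValueProperty).AddValueChanged(AssociatedObject, OnInputChanged);
            Descriptor(MyEx.UnitSystemProperty).AddValueChanged(AssociatedObject, OnInputChanged);
        }

        protected override void OnDetaching()
        {
            Descriptor(MyEx.SourceValueProperty).RemoveValueChanged(AssociatedObject, OnInputChanged);
            Descriptor(MyEx.UnitSystemProperty).RemoveValueChanged(AssociatedObject, OnInputChanged);
            base.OnDetaching();
        }

        private DependencyPropertyDescriptor Descriptor(DependencyProperty dp)
        {
            return DependencyPropertyDescriptor.FromProperty(dp, AssociatedObject.GetType());
        }

        private void OnInputChanged(object sender, EventArgs e)
        {
            double raw = MyEx.GetSourceValue(AssociatedObject);
            string unit = MyEx.GetUnitSystem(AssociatedObject);
            // Placeholder conversion; the result is emitted through the third attached property.
            MyEx.SetDisplayedValue(AssociatedObject, unit == "imperial" ? raw / 25.4 : raw);
        }
    }

(With Blend behaviors, the usual way of attaching in XAML is via `<i:Interaction.Behaviors>`.)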
I think you can see now how powerful it can be. It's more complex than styles, bindings and triggers, but on the other hand it allows you to attach literally any logic and any behavior. While bindings and conditions give you some sort of a language to set the rules, it's sometimes simply not enough, or it actually gets overcomplicated in XAML, and writing the same logic in C# or VB is much simpler than via multibindings, converters and conditions. In such cases, Behaviors are great!
..but this hint would not be complete without mentioning that Behaviors really are more complex, as you often need to set up or observe some bindings from code, not XAML, you need to remember about the Attach/Detach lifecycle, and they put some additional overhead on the whole system, as they introduce more properties/values to be tracked by WPF. It's really hard to tell if this overhead is higher or lower than that of MultiBindings and lots of Converters. I'd guess it's actually lower, but I have not measured that.
By the way, I almost forgot to add the link: [here's a very quick overview of AttachedBehaviors](http://briannoyes.net/2012/12/20/attached-behaviors-vs-attached-properties-vs-blend-behaviors/). Please note that this term is used for two things: 'quick&dirty' behaviors registered through setting an AttachedDependencyProperty, and 'fully-fledged' behaviors registered via XAML tags ("Blend Behaviors", named after MS ExpressionBlend). They look a bit different, but the idea of operation is mostly the same. While creating and using the latter is a bit more involved, they tend to be structured better, easier & more reusable than the former, and they come with handy Attach/Detach methods. It's good to familiarize yourself with both types!
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-35736975191740345762015-01-08T13:45:00.001+01:002015-08-19T15:47:24.486+02:00Random findings about EF and the way it sets up Connections and Contexts<script type="text/markdown">
<![CDATA[
Here's the long, untrimmed version of my answer, with all the babbling. If somebody is interested, the question came from [StackOverflow: Dynamic Connection String in Entity Framework](http://stackoverflow.com/questions/27830155/dynamic-connection-string-in-entity-framework/27830508).
---
Ok, as I have not found any trace of the code I produced back then, I started digging again, trying to remember what I tried doing. I remember that in the end everything almost worked, but I got stuck at ... not being able to provide or build an **instance** of EntityConnection, because its connection string required three mapping files, and since I used Code-First I simply didn't have them. However, in your case it should do the trick.
So.. the most important thing is of course the InternalContext. It's the brain here and it can be initialized in many ways, driven by the various DbContext constructors. I first targeted the "ObjectContext" overload and tried to provide my own already-connected instance, but that proved impossible: I finally managed to get hold of a service that returns a provider that returns a factory that builds ObjectContexts ... but this factory required me to pass a DbContext. Remember that I wanted an ObjectContext first, to be able to call that DbCtx ctor overload.. Zonk.
I decompiled the factory and yet again it turned out that with the current implementation of DbCtx/ObjCtx and this factory, the only point where it can actually be called is inside DbContext's constructors. The ObjectContext produced by the factory gets bound to that specific DbCtx and it's not possible to untangle them later. I remember that it's all about Connections and Metadata. It was something like "DbCtx provides the connection", "DbCtx delegates all other jobs to ObjCtx", "ObjCtx provides the metadata" but "ObjCtx does not know where to get the metadata from", so it asks a 'service' to find it, which in turn looks up the connection, which it gets from DbCtx..
There was also some play with IObjectContextAdapter, but I don't remember it now, except for the fact that I got to a point where I rolled in my own `ObjectContext2` class (I think I just subclassed the ObjectContext) which provided everything all by itself and didn't need that factory, but I got stuck at not being able to manually construct and initialize Metadata(Workspace?) properly.
So, that was the end of "EagerInternalConnection" path, which was the most promising one, since it did not rely on any defaults and just took everything I provided to it.
---
The last thing left was `InitializeLazyInternalContext` (which you mentioned), which basically initializes everything according to defaults. That's either the true defaults derived from the name of the context class, or you can provide a "nameOrConnectionString" (which I could not use, since it expects an EntityConnectionString and I used Code-First). Or, there's a third option that takes a `DbCompiledModel`. It's `protected`, but when you inherit from DbContext you can call it easily.
It's a bit tricky to get an instance of DbCompiledModel though. It can't be built directly; it has to be obtained from a DbModelBuilder, which in turn is usually only temporarily available during your XYZContext initialization. Surely you remember the `OnModelCreating(DbModelBuilder)` method where all model configuration takes place. So.. it's inside XYZContext, yet you need the CompiledModel before you start constructing your XYZContext/DbContext again. You can refactor all the model setup out of the context, or you may simply hack in and expose the protected OnModelCreating method, even statically, to be able to build models manually. However, a few bits are missing in the OnModelCreating method: there are some base policies and conventions that are set **before** this method is called, so if you create the builder manually.. you'd need to again decompile & find the part that sets them, and I remember it's well private/hidden, so you'd need to invoke it via reflection or copy all the setup code..
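For reference, here's a minimal sketch of that `DbCompiledModel` route as I understand it (EF6 API; `BlogContext`/`Blog` are made-up names, and this glosses over the missing-conventions caveat above):

    using System.Data.Common;
    using System.Data.Entity;
    using System.Data.Entity.Infrastructure;
    using System.Data.SqlClient;

    public class Blog { public int Id { get; set; } }

    public class BlogContext : DbContext
    {
        // Surfaces the protected DbContext(DbCompiledModel) overload mentioned above.
        public BlogContext(DbCompiledModel model) : base(model) { }

        // Model setup refactored out of OnModelCreating, so it can run standalone.
        public static DbCompiledModel BuildModel(DbConnection providerConnection)
        {
            var builder = new DbModelBuilder();
            builder.Entity<Blog>();
            return builder.Build(providerConnection).Compile();
        }
    }

    // usage: the connection only supplies provider information for model building
    // var compiled = BlogContext.BuildModel(new SqlConnection("Data Source=.;Initial Catalog=Blogs;Integrated Security=True"));
    // using (var ctx = new BlogContext(compiled)) { /* ... */ }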
Anyways, after refactoring some code I managed to have the DbModelBuilder configured, the model built, compiled, and passed to the new MyDbModel constructor -- and it ground to a halt again, because of some issues I don't recall now, sorry..
---
The factories/services I talked about: it's all built around the `IDbDependencyResolver` interface. Look at `System.Data.Entity.Infrastructure.DependencyResolution` - you'll find it inside. There's a handy implementation of a service/instance resolver called `SingletonDependencyResolver<T>` that will be sufficient in most cases.
Registering resolvers is tricky. If I remember well, I injected the resolvers through `(same namespace)DbConfigurationManager.(static)Instance.GetConfiguration()`, which returns the IoC configuration object, and `.AddDependencyResolver(..)` or `.RegisterSingleton<TService>`. All internal though, so you must use reflection here. I remember that the IoC and resolvers can also be configured with special entries in the app.config settings, but I didn't use that, as I was focused on 100% runtime configuration. Now I see that I actually didn't need reflection: the resolver classes would not change at runtime, so I could certainly have injected them from app.config. You may try that instead of reflection.
Ok, so the resolvers.. There are a bunch of them you can override. I remember playing with `ProviderFactoryResolver`, `InvariantNameResolver`, and `ProviderServiceResolver`. The first one is meant to provide the `DbProviderFactory` that is the root of all connection builders. The second one extracts the 'invariant name' from the factory - that's in fact the "dialect" name, like "sql2008"; many things are keyed/identified by this invariant name. I don't remember much about the third one, `ProviderServiceResolver`.
Look first at `DbProviderFactoryResolver`. If you register your own resolver of this type, you will be able to inject your own `DbProviderFactory`-ies. Your own `DbProviderFactory` seems crucial. Look at the decompiled `EntityConnection` and its `ChangeConnection` method. It is one of the core methods that, well, sets or changes the connection string:
private void ChangeConnection(string ..blah..) {
...
DbProviderFactory fac = null;
...
fac = DbDependencyResolverExtensions.GetService<DbProviderFactory> ( ... )
...
connection = EntityConnection.GetStoreConnection(fac);
...
connstr = ..blah..["provider connection string"];
DbInterception.Dispatch.Connection.SetConnection(connection, new DbConnectionPropertyInterceptionContext<string>(..).WithValue(connstr) );
}
Yep, the `DbProviderFactory` you see there is just it. Its `CreateConnection` returns a `DbConnection` (it can be yours), which is then configured with the 'provider connection string' taken right from the entityconnectionstring section's settings. So, if you additionally roll in your own `DbConnection`, built by your own `DbProviderFactory`, it could then ignore the incoming connstring in favor of whatever you want.
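To make the idea concrete, here's a rough sketch (plain ADO.NET; `PinnedConnection`/`PinnedProviderFactory` are my own made-up names, and the command/transaction plumbing is simplified) of a factory whose connections simply ignore whatever connection string is set on them later:

    using System.Data;
    using System.Data.Common;
    using System.Data.SqlClient;

    // A DbConnection that delegates to a real SqlConnection but silently ignores
    // later attempts to overwrite its connection string.
    public class PinnedConnection : DbConnection
    {
        private readonly SqlConnection _inner;
        public PinnedConnection(string pinned) { _inner = new SqlConnection(pinned); }

        public override string ConnectionString
        {
            get { return _inner.ConnectionString; }
            set { /* deliberately ignored - the target is pinned */ }
        }

        public override string Database { get { return _inner.Database; } }
        public override string DataSource { get { return _inner.DataSource; } }
        public override string ServerVersion { get { return _inner.ServerVersion; } }
        public override ConnectionState State { get { return _inner.State; } }

        public override void ChangeDatabase(string databaseName) { _inner.ChangeDatabase(databaseName); }
        public override void Close() { _inner.Close(); }
        public override void Open() { _inner.Open(); }

        protected override DbTransaction BeginDbTransaction(IsolationLevel il) { return _inner.BeginTransaction(il); }
        protected override DbCommand CreateDbCommand() { return _inner.CreateCommand(); }
    }

    // The factory that a DbProviderFactoryResolver could hand out to EF.
    public class PinnedProviderFactory : DbProviderFactory
    {
        public static readonly PinnedProviderFactory Instance = new PinnedProviderFactory();
        public override DbConnection CreateConnection()
        {
            return new PinnedConnection("Data Source=.;Initial Catalog=MyRealDb;Integrated Security=True");
        }
    }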
Also, I have not investigated the `DbInterception.Dispatch.Connection.SetConnection` part, because back then my play-time had run out. There is some slight chance that `Interception` is meant literally and that you would be able to "just register" an interceptor that overrides the "SetConnectionString".
---
But, well.. that's an enormous amount of work. Really. Using a separate file to exploit `partial` and expose an additional base DbContext constructor, then inheriting from the autogenerated context class and providing the connstring through that, is much, much simpler!
]]>
</script>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-40061971921841538172013-07-04T16:01:00.001+02:002015-08-19T15:47:10.637+02:00Quick review/overview of AX6002P<script type="text/markdown">
<![CDATA[
I've recently bought an AX6002P, a 60V 2A power supply. I wanted to reduce the number of AC adapters offering various voltages, scrapped from various devices, and a cheap USB-programmable power supply looked very promising. I've got mine from <a href="http://tme.eu">TME</a> - probably I could have gotten it cheaper from DX or Alibaba, but I wanted to have some support, just in case. Surely made in [PRC](http://en.wikipedia.org/wiki/China).
[TODO: front|back image here]
You may also notice the RS232 port near the USB.
Usage of the device is intuitive and easy, but it has some minor gotchas:
0) The device is not 60V/2A, but actually 60V/2.1A and it displays values accordingly. Not a thing to complain about :)
1) Moving the knob has no effect unless you press the "Voltage/Current" button. This is reasonable from the safety point of view - I'd certainly not want to accidentally bump the voltage by +-10V! For the same purpose there's also a LOCK button, which makes one or the other safeguard a bit redundant. I actually have never used the LOCK button yet, due to the 3rd point below.
2) By pressing the "Voltage/Current" button you select either the VoltageLimit or the CurrentLimit to be adjusted. The display starts blinking to give a hint on what you are adjusting. Then with another two buttons you can pick the granularity of change ("the digit": 1.00, 0.10, 0.01 and so on) so that you can make coarse or very fine changes. It is not just a digit-by-digit setup: if you have picked a granularity of 0.01 and had a value of "1.15", then spinning the knob will let you go to "0.35" or "2.56" in 0.01 steps. You don't have to switchdigit-roll-switchdigit-roll. Cool!
3) But, there's no way of leaving the Adjustment mode - it simply times out after a few seconds. And, after timing out, it **forgets which setting was adjusted** and **forgets the granularity**. Now that's irritating. I press "Voltage/Current" twice to get to AdjustCurrent mode, then press "Left" twice to pick the "0.001" granularity, roll up/down a bit, **stop rolling to look at the results**, and then when I turn the knob again - /dev/null - NOP - ignore - no change! The adjust mode has timed out. I have to press "Voltage/Current" twice again and set the granularity again. This is a minor issue, just UI ergonomics, but when having to do it for the 15th time in a row - my, that irritates. If it times out, it should definitely remember your last 'location'.
4) The device has a switch button that turns the outputs on and off, accompanied by a LED. While in OutsOff mode, it displays the Voltage and Current limits. While in OutsOn mode, the red LED is on and the device displays the measured Voltage and Current. Nice and very reasonable. Although I got scared once when I got back to the room and noticed 10.00/1.500 on the display. I thought that I'd burnt everything, but the outs were off and those were just the limits, not the current state. You need to watch the LEDs, not just the numbers ;)
5) By default, when hitting the CurrentLimit, the device will try to lower the Voltage so that the Current does not exceed the limit. Sometimes it is desired, but not always. With certain active circuits the PSU may start oscillating desperately trying to keep the voltage/current within the limits. However, there's also some delay between measurement and the correction. This means that if the load switches more quickly than the PSU can react, the PSU may make the corrections out of phase, and you may notice that the applied voltage and/or current is **actually higher than the limits set**. Of course I'd expect that to some degree as in every simple feedback system. What's very important is that the PSU displays the **actual values**. So, it will not only provide, but also display voltage and current **higher** than the limit. Good!
6) There's also a different over-limit handling mode: you can set the device into an "EmergencyCutOff" mode. IIRC, if you turn on the "OVP" button, then when the current exceeds the limit, the outputs are immediately turned off. This effectively prevents any of the mentioned oscillations. However, keep in mind that in this mode your circuit may need some **soft start** enhancements. If your circuit has a power port filled with high-capacity caps, then the initial spike of current drawn by just a handful of them will very likely trigger the alarm point in an instant after you press the "OutsOn" button. That's obvious, but it might be a surprise for a person like me who did not have a PSU before.
7) The device has 4 memory buttons: M1..M4. But it has 5 LEDs that indicate which memory slot is currently used! Surprise! It took me a while to notice a small note in the device's userguide that M5 is accessible by pressing M4 and immediately turning the knob.
8) Now, you might wonder why I speak about the 'currently active memory slot'. The userguide says that you can press the M1..M4 buttons to immediately jump to a remembered setting, and that you can hold a button for 2 seconds to store a setting. So, how can a memory slot be "active"? The memory slots hold some settings that you've remembered or programmed, right? Not. How would you program M5 when it does not have a button to hold for 2 seconds? :) It turns out that M1..M5 are "last values used" slots. When you press M3 it really loads M3's values, but then M3 stays active, and if you Adjust the values, M3 gets immediately updated. There is no way to "exit" from a memory slot. At any single point in time you will always, in fact, be editing some memory slot. But, still, it is quite usable. Maybe not what I'd strictly expect, but overall - handy.
9) Switching between M1-M4 turns off the Outputs immediately. That means that if you switch from M1 to M2, **your circuit will get powered off** for a moment, until you press the "OutsOn/Off" button.. not good? I am not sure. On one hand, I'd not want to accidentally put +60V on a 5V circuit just because I mistook M2 for M3; on the other hand, I'd really like to be able to quickly switch between 5V/0.1A and 5.2V/0.8A. Oh well. That PSU was cheap!
10) **When outputs are off, the device delivers about... -0.3V. Not zero!** That surprised me a lot. I was fortunate to be playing with a TTL circuit that has a large margin, but I bet someone could accidentally burn their chip if it lacks any protection. This PSU seems to have large capacitors at its output, so maybe that "slight" negative voltage is meant to quickly discharge them. I don't know, I just guess. If you buy it, be sure to check what your unit does when OutputsAreTurnedOff.
That's all from the quick notes for now. That covers all the things I've noticed about the basic usage. Most of them are not issues at all; the rest are really minor.
Summarizing shortly, I am currently quite satisfied with it. It didn't nuke my budget, it has some issues, but all the basic features work. I'd not recommend it for intensive daily use, as the front panel simply lacks ergonomics. But for an amateur workshop, as a simple power supply where the current limit only works as a precaution - it fits just great! I just regret that the two- or three-channel versions are much more expensive. Eh.
Actually, I'm writing this text as a starting point for a whole set of other things. You know, this is a "digging" blog, not gadget reviews ;) Most of my recent research with this device was about software, since I got irritated by the AdjustmentMode timeout and wanted to drive the PSU over USB. Anyways, that's why I paid extra for the "programmable" part. I've found many interesting things about the device, its history and clones, various problems that it had, the protocol, the software etc. I'll describe it all incrementally in other posts, probably heavily saturated with links.
]]>
</script>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-2353222392640616732.post-57544758933872203812013-06-25T13:56:00.000+02:002013-06-25T21:16:10.691+02:00ADATA NH92 recurring malfunction<script type="text/markdown">
<![CDATA[
A few days ago I received from a friend an external HDD drive that has gone wild yet again. The previous time it happened was about half a year ago, so it now seems to have established its frequency. He uses it as a backup device for most of his personal archives and he handles the drive with extra care: no physical damage, always safely unmounting, and so on. The disk is then kept in a box on a shelf (no nearby magnets :) ) and after some time.. it just refuses to work.
Actually, the disk seems to work, but WindowsXP refuses to see its partitions and data and asks to 'format the disk before use'. Obviously not a thing to do on an archive ;)
The damage and quick fix
----------------------------------
The drive in question is an ADATA NH92: an external USB 2.0 case, with a 500 GB, 5400 rpm, 8 MB cache disk inside. It is set up with one huge FAT32 partition. Not very safe for long-term archives, so the first time I heard about it and got the drive, I suspected the worst.
I examined the drive's contents as thoroughly as possible and actually the file system was not damaged at all. Both FAT copies were identical, all directories were in perfect condition, nothing truncated, all files were readable, not crosslinked, and seemed OK.
The only thing that was damaged each time was the BootSector. Curiously, it was overwritten not with random trash but got almost zeroed out. Almost, because there was some apparently structured short data at the beginning of the block.
I suspected that some antivirus or cleanup utility tried to help my friend recover the disk and failed, but the data block did not look like anything that could be a reasonable BootSector, so a virus, maybe? But immediately after the damaged BootSector a copy of it was left intact, so that would have to be a very friendly virus..
Restoring the BS from its backup immediately revived the disk: Windows stopped complaining about formatting and displayed the contents. At first I did it manually, but it has happened three or four times already, so lately I've been using TestDisk (http://www.cgsecurity.org/wiki/TestDisk). With it, restoring the BS from its backup is really quick: in the menu, look into 'Advanced' and then 'Boot'. If you are going to use it, keep in mind that you should first check what is actually damaged. Your drive may have different problems.
Problem and partial diagnosis
----------------------------------
Unfortunately, I was all too happy about reviving the drive and I did not preserve backup copies from all the earlier incidents to compare what was actually written to the BS, but I remembered one thing: the whole sector was almost zeroed out and contained a few bytes of data with a "USBC" string.
The current fault again looked like this; here's the dump:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00007E00 55 53 42 43 40 C4 45 89 00 04 00 00 00 00 0A 2A USBC@ÄE........*
00007E10 00 00 00 00 3F 00 00 02 00 00 00 00 00 00 00 6A ....?..........j
00007E20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00007E30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
........ .... zeroes ...
00007FC0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00007FD0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00007FE0 00 00 00 00 72 72 41 61 A1 D2 E8 00 03 00 00 00 ....rrAa.Òè.....
As you see, that's nothing like random trash or the middle of a binary file. It looks like some eye-catching magic string, some header, and then a series of small integers. So, I hunted after the USBC magic string and it turned out to be ...
<a href="http://wiki.livedoor.jp/h8h8h8/d/USB%20Memo">Wait. What? SCSI over USB packet?!</a>
The <a href="http://www.usb.org/developers/devclass_docs/usbmassbulk_10.pdf">docs for USB Mass Storage Bulk Transfer, page 13</a> confirmed it: those 'USBC' bytes (the little-endian dword 0x43425355) are the signature of a CBW packet, which is a USB wrapper for a SCSI command. Dissecting the data visible above, we get:
0x55 0x53 0x42 0x43 ('USBC') - dCBWSignature, the CBW identifier (the little-endian dword 0x43425355)
0x8945C440 - dCBWTag, a passthrough echo (little-endian)
0x00000400 - dCBWDataTransferLength, bulk transfer bytes, little-endian [1024 B = 2 sectors]
0x00 - bmCBWFlags [dir = 0: from host to device]
0x00 - bCBWLUN [device = 0]
0x0A - bCBWCBLength, wrapped command length [10 B]
and the following 10 bytes are the 'CBWCB' - the original command for the drive that got wrapped. (Note: the CBW's multi-byte fields are little-endian, while the SCSI command inside uses big-endian fields.)
The disk is internally a simple 2.5" drive, so - the payload is <a href="http://en.wikipedia.org/wiki/SCSI_Write_Commands#Write_.2810.29">just a SCSI write command</a>:
0x2A - operation code "write at LBA(10)"
0x00 - flags: WRPROTECT=0, DPO=0, FUA=0, FUA_NV=0
0x0000003F - LBA address: 0x3F [!]
0x00 - group number=0
0x0002 - transfer length: 2 blocks
0x00 - control byte
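(Note how the CBW's dCBWDataTransferLength - 1024 bytes - matches the CDB's two 512-byte sectors.) To double-check the dissection, here's a small C# sketch that parses the 31-byte CBW and its CDB; the bytes are copied from the dump above, and BitConverter is assumed to run on a little-endian host:

    using System;

    class CbwDump
    {
        static void Main()
        {
            // The bytes from offset 0x7E00: 15-byte CBW header plus the 10-byte CBWCB
            // (the CBWCB field itself is 16 bytes; the rest is zero-padded in the dump).
            byte[] s = {
                0x55,0x53,0x42,0x43, 0x40,0xC4,0x45,0x89, 0x00,0x04,0x00,0x00,
                0x00, 0x00, 0x0A,
                0x2A,0x00,0x00,0x00,0x00,0x3F,0x00,0x00,0x02,0x00 // WRITE(10) CDB
            };

            // CBW header fields are little-endian.
            Console.WriteLine("signature: 0x{0:X8}", BitConverter.ToUInt32(s, 0)); // 0x43425355 = "USBC"
            Console.WriteLine("tag:       0x{0:X8}", BitConverter.ToUInt32(s, 4));
            Console.WriteLine("length:    {0} B",    BitConverter.ToUInt32(s, 8)); // 1024 = 2 x 512 B
            Console.WriteLine("flags:     0x{0:X2}", s[12]); // bit7 = 0: host-to-device
            Console.WriteLine("lun:       {0}",      s[13]);
            Console.WriteLine("cb length: {0}",      s[14]);

            // The wrapped SCSI WRITE(10) CDB uses big-endian fields.
            Console.WriteLine("opcode:    0x{0:X2}", s[15]); // 0x2A = WRITE(10)
            uint lba = (uint)((s[17] << 24) | (s[18] << 16) | (s[19] << 8) | s[20]);
            int blocks = (s[22] << 8) | s[23];
            Console.WriteLine("LBA: 0x{0:X}, blocks: {1}", lba, blocks); // 0x3F, 2
        }
    }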
Now look at the LBA in that command.
The BootSector that was damaged was at offset 0x7E00 so at sector 0x3E.
The 0x3F sector is the BackupBootSector.
This write command was meant to write at sector 0x3F.
This write command contains 'transfer length' of two blocks.
The BootSector and its copy should always be exactly the same, so if the Backup were to be written, then probably the normal BootSector was meant to be refreshed too. That means that the BootSector was not overwritten accidentally. The BootSector was meant to be updated, and then, probably immediately, the BackupBootSector was meant to be updated too.
But how could the whole write command get written to the drive, even with its USB wrapper, instead of being executed? And in such a way that no other sectors were damaged?
I think that the answer lies somewhere in how BulkTransfer works. Looking at the raw data stream, it'd be something like this:
... | command#0 | bulk data for command#0 | command#1 | bulk data for command#1 | command#2 | ...
Command#0 would be "write-a-BootSector" and command#1 would be "write-a-BackupBootSector". Transfers are performed in larger blocks and commands are short, so each command is padded to fill at least a whole block. That way, the drive's controller may just read block-by-block and either interpret a block as a command, or pass it further as a block of data to process. To know how many blocks to pass, each command holds the information on how long its attached bulk data is.
Now, consider what would happen if for some reason the bulk data for the first command is missing. The controller inside the external drive would get:
... | command#0 | command#1 | bulk data for command#1 | command#2 | ...
It would fetch command#0, read it as "write-a-block-at-BootSector", and then it would treat the next blocks as the anticipated bulk data to be written ... so it would consume the immediately following command as data and leave bulkdata#1 unread. Then, bulkdata#1 would be consumed as the next command. In the optimistic case, it would fail, everything would get out of sync, and the communication would probably be discarded and reinitialized. In the pessimistic case, the bulk data may look like a command and even further damage could occur.
Currently, I do not know if such out-of-sync errors are reported anywhere and how to check for them. I also have no idea how a block of bulk data could evaporate. But, for me, it just looks like it did. It certainly did not evaporate at random, since this problem occurred many times with this external drive. Moreover, if it happened at random, it would likely happen all over the place, not just at the first BootSector!
Ok, so if not at random, let's consider the second extreme: a systematic error.
Writing to the BootSector and to the BackupBootSector are most probably performed in exactly the same way, just the LBA address is 3E or 3F. Now consider that both operations have their bulk data systematically discarded:
... | command#0 | command#1 | command#2 | ...
That would simply write "command#1" as the new BootSector, leave the BackupBootSector as-is, and happily proceed with the next command. That way absolutely no errors would be observed. Considering that the operating system is glad that everything went well, the disk would operate normally until it is unplugged, and then the next day it would be dead. Or, if the system flushed and reloaded the disk's configuration, it would immediately drop dead.
Disclaimer
----------------------------------
I've found the USBC marker, I've analyzed the data, it matches the CBW/SCSI command. All the rest is just my guessing.
For me, it seems like some bulk data was discarded, but I do not know whether it was WindowsXP's faulty device driver, the USB chips on the mainboard, the USB->drive adapter that sits inside the ADATA NH92 aluminium case, or maybe the drive's internal electronics. Well, actually we can cross out the drive's electronics, since under normal operation the adapter should translate the USB/CBW/SCSI message into a plain SCSI message before sending it to the drive. This leaves the driver or the adapter.
Therefore, the next thing I'll suggest to my friend is to buy some new USB adapter with a new case - not an NH92 and, even better, not from ADATA - put the disk into it and observe how it behaves. If any similar fault appears again, that would mean that either the mainboard's USB controller or WindowsXP's USB MassStorage device driver is faulty.
However, I doubt it. I am quite sure that the problem is in the USB adapter that comes with the ADATA NH92. My friend uses many other USB storage devices, like pendrives or cameras that expose their internal memory as a USB storage device, and the problem occurs only with that single ADATA device.
Scale of the problem
----------------------------------
The ADATA NH92 seems to have problems. Searching over the internet, I quickly found many complaints about losing data; for example, <a href="http://www.chip.pl/testy/pamieci-masowe/dyski-twarde-zewnetrzne-25-cala/a-data-nobility-nh92-anh92-500gu-csv-500gb">see the review's comments here</a>. This one is in Polish only, but lots of similar ones can be found. Many claim that the disk worked properly and at some point it just "died" and asked to be formatted. Some even attempted a diagnosis, and they found that e.g. BootSectors were overwritten with USBC marks :P
But, there are more serious cases. For example <a href="http://superuser.com/a/585663/233273">here at SuperUser</a>, a user of this drive describes that he found out many more sectors were overwritten with USBC marks.
Recalling that at first I thought maybe some virus had damaged the drive, I searched for "USBC" in a somewhat broader sense. I found many, really many posts on various forums complaining about some "USBC virus" that would disable drives or damage data. Most of them could be summarized as follows:
- drive wants to be formatted, critical sectors were overwritten with 'USBC'
- the contents of a directory evaporated, and a single large USBCxxxx file showed up (where xxxx are some strange characters)
- a file got damaged, its contents were overwritten with "USBC" at some point
Michal's post and the search results worry me greatly.
First, in his ADATA NH92 the damage scheme was similar, but it occurred at random locations. I'm currently running a low-level search for USBC marks over the whole of my friend's drive, but nothing has been found yet. Maybe this drive was lucky, or maybe Michal's from SuperUser was not.
Second, the internet fora indicate that this problem obviously occurs not only in the ADATA NH92. People have reported the same problem with other external drives and card readers. This might indicate some serious problem in a whole batch of USB hardware controllers used in cheap adapters (quite probable!) or even those more expensive ones used in mainboards (unlikely) or a hideous bug in the device driver.
I'd say that under these observations the 'device driver' option is quite reasonable, but I think I've seen posts about "USBC" problems written by people using Linux, so it's hard to put all the blame on WindowsXP's drivers ;)
FYI: ADATA NH92 adapter chipset is Moai MA6116F6 422A-1035 MAN09088.1 backed by 24C02 eeprom.
tl;dr
----------------------------------
I'm no expert. If you have similar problems, I suggest you first consult someone who is. If you cannot, or cannot afford to, then you might do the same as I told my friend: buy another USB adapter with a plastic or metal case. Just an adapter and a case; you don't need a new disk. Then move the disk from the old case (i.e. the NH92) into the new one, and put your old adapter aside. If the problem never repeats for some reasonable time - trash the old adapter and be happy. If the problem repeats - it's the mainboard or your operating system; fix that and you can put the old adapter to use with some other disk - or it's the new adapter having the same fault. Maybe try with yet another one? Still cheaper to test than to buy or repair a mainboard..
Final note
----------------------------------
I am quite convinced that this is the adapter's fault, but I may be wrong. If you know more about how/why the command got written to the disk instead of data, drop me a note or link!
]]>
</script>Unknownnoreply@blogger.com8tag:blogger.com,1999:blog-2353222392640616732.post-56845288904090094372012-10-04T16:30:00.000+02:002013-06-25T13:59:45.289+02:00Notes on Symbian, Qt, QJson, DLLs and Capabilities<script type="text/markdown">
<![CDATA[
I'm working on a small application for the Symbian-Belle platform. Everything was going well until I hit a place where I had to parse a few JSON files. I found the QJson library, which seemed very relevant and easy to use (although it seems to return everything as QVariantMap, which probably is not that lightweight..).
This library is, however, not provided in a precompiled form. It is source-only, but the source is openly available at http://gitorious.org/qjson/qjson/. Fortunately, it is prepared to compile under QtCreator's Symbian projects.
Initial project setups
----------------------------------
After fetching the library from Git, I tried compiling it and it went well. I wanted the lib to compile along with the main project, so I created a top-level SUBDIRS project and put QJson underneath it, together with the project of my application. I set all the inter-project dependencies and corrected the build and run configuration to actually run my application (a SUBDIRS top-level project does not set that by default, it has to be corrected manually).
I've chosen the simulator configuration, pressed Run, compilation passed, but linking did not:
Error: file 'qjsond.lib' was not found
The default QtCreator's dependency management added the lines:
win32:CONFIG(release, debug|release): LIBS += -L$$OUT_PWD/../QJson/lib/ -lqjson
else:win32:CONFIG(debug, debug|release): LIBS += -L$$OUT_PWD/../QJson/lib/ -lqjsond
else:symbian: LIBS += -lqjson
else:unix: LIBS += -L$$OUT_PWD/../QJson/lib/ -lqjson
and the QJson library has generated only the 'qjson' binary. I am running on Windows, so building for the Simulator looked for 'qjsond', just as the second line clearly states. BTW, this is the de facto standard way of naming debug versions of libraries on win32, introduced by Microsoft's VisualStudio series.. Anyways, it's not relevant here. QJson, even in debug mode, builds only without the -d suffix, so I removed it:
win32:CONFIG(release, debug|release): LIBS += -L$$OUT_PWD/../QJson/lib/ -lqjson
else:win32:CONFIG(debug, debug|release): LIBS += -L$$OUT_PWD/../QJson/lib/ -lqjson
else:symbian: LIBS += -lqjson
else:unix: LIBS += -L$$OUT_PWD/../QJson/lib/ -lqjson
Everything compiled well and linked well, so I ran it in the Simulator. And it ran! Yay.
And on the device..
----------------------------------
So, the next part was to run it on the real device - and here another class of problems began. After building, linking, deploying and installing, the IDE told me:
Launch failed: Command answer [command error], 1 values(s) to request: 'C|4|Processes|start|""|"MyApp.exe"|[""]|[]|true'
#0 {"Code":-46,Format="Failed to create the process (verify that the executable and all required DLLs have been transferred and that the process is not already running) (permission denied)"}
Error: 'Failed to create the process (verify that the executable and all required DLLs have been transferred and that the process is not already running) (permission denied)' Code: -46
and the launch process was instantly aborted. Of course, the application had been launching properly back when the json-related code and library were not yet added to the project..
I checked, and the QJson library was indeed properly installed on the device, so it was not about missing files. The application itself had been properly installed too. Anyways, in the parentheses there's a note, `permission denied`, which was quite suspicious. When I tried to run the application manually on the device, I got:
> `Dostęp do aplikacji został zablokowany przez administratora`
which in English would be something like:
> `Access to the application has been blocked by the administrator`
actually, on devices with the English language set, the message is:
> `Unable to execute file for security reasons`
Well, this was not what I expected when adding a DLL. I expected a crash, a SIGSEGV, Kernel-Panic-#32 etc, but not some security/administrative restrictions. I searched a bit in Nokia's documentation and found a page that mentioned that DLL projects can also have **Capabilities** specified, just like normal .EXE applications.
QJson's DLL Capabilities and UID3
---------------------------------
Actually, bad Capability definitions could have caused that error, so I've checked the .pro files from the library project. In 'src.pro' I've found lines:
#TARGET.UID3 =
TARGET.CAPABILITY = ReadDeviceData WriteDeviceData
From elsewhere I knew that those capabilities require the application to pass the Symbian-Signed process. Currently my app was just self-signed, as this is QtCreator's default setting. I didn't want to set up the full signing process yet, and my application does not require those capabilities at all, so I commented them out, rebuilt, and it didn't help at all :)
Even worse, after the next few attempts, I started getting the same error as here: http://stackoverflow.com/questions/12705282/use-qjson-in-my-qt-symbain-app:
error: Installation failed: 'Failed to overwrite file owned by another package: c:\sys\bin\qjson.dll in (....)
Then I remembered that near those capabs in the qjson library's project file, there was a UID3 commented out! So, probably the build process had been using some temporary or random one, and it changed! The installer cannot verify that I am in fact updating an old version, and thinks that some new incoming .sis file is trying to overwrite a file that does not belong to it.
I've manually uninstalled the old package, updated the qjson project with
TARGET.UID3 = 0xE0123456
then rebuilt, and the installation succeeded. However, it still could not launch due to the security error, even with the spurious capabilities removed.
DLL Capability explanation
--------------------------
Here's an article that is a **must-read** if you are ever going to use DLLs on the newest versions of the Symbian platform. You see, when I last touched the subject there was no such thing as 'capabilities' on Symbian, and other platforms like WP7 define them a bit differently. Anyways, the article is http://www.developer.nokia.com/Community/Wiki/Shared_Library_DLLs_on_Qt_for_Symbian
Inside there's a small section named "Set platform security capabilities" (http://www.developer.nokia.com/Community/Wiki/Shared_Library_DLLs_on_Qt_for_Symbian#Capabilities). In this section a very important passage appears:
> `Capabilities are treated slightly differently for EXES and DLLs. In essence, while an EXE must be granted the capabilities required to call the APIs that it uses (or which are used by its loaded DLLs), a DLL must be given all the capabilities of all the EXEs that might need to load it. For a DLL used by a single application exe this would be the same as the application's capabilities. If however the DLL is to be used by arbitrary clients, you will need to give it as many capabilities as possible.`
and also, far later:
> `Note that the application .pro file should also specify all the capabilities that the EXE needs to use any protected API that it calls or the DLL calls - the DLL should have all the capabilities specified in the EXE (and it may have more).`
What does it mean? For EXEs, the Capabilities specify what that EXE is allowed to try to use. If an EXE tries to read the device IDs, it needs ReadDeviceData, or else an error will be raised. DLLs, in turn, always run with the capabilities the calling EXE provides - anything else would be dumb or useless in terms of security. Thus, specifying Capabilities on a DLL project **does not grant** any access rights to the DLL. If a DLL reads device IDs, then the calling EXE must have the Capability specified..
However, Capabilities are also used in DLL projects, but there their meaning changes drastically: they specify in **what contexts the DLL is safe to be used**. They are a means of describing the DLL's safety. For example, if a DLL is marked with the ReadDeviceData Capability, it could mean that it has been checked not to peek at some user-critical data and send it over the internet, and therefore this DLL _is safe to be used by applications that have the ReadDeviceData_ flag..
All those facts taken together create a simple security rule:
If your application has some Capabilities specified, then all DLLs that are loaded by the application must have at least all those Capabilities.
In the example of my project, my application had specified 'NetworkServices', and the QJson library by default has 'ReadDeviceData WriteDeviceData'. It is a complete mismatch, and when the OS tried to run my application, it discovered that my "network-related application" tried to load a DLL that is "not safe" in terms of networking, so it halted the process..
Solution
--------------
The solution is very simple. I've just had to leave my application project as it was, with no changes, and only add the network capability to the QJson's project (to mark it as safe-to-use-with-network):
TARGET.UID3 = 0xE0123456
TARGET.CAPABILITY = ReadDeviceData WriteDeviceData NetworkServices
Please note that I did not even have to remove the *DeviceData capabilities: my application does not have them.
Better fix?
--------------
According to Nokia's document and to simple reasoning, DLLs should have the widest possible Capabilities assigned, especially if they are to be used by different projects for different applications.
Thus, I cannot fathom why the QJson library comes only with the *DeviceData capabilities. JSON is heavily related to networking, so it is quite obvious that applications using this library will have the NetworkServices capability!
I'll file a patch to the project with more capabilities added. In the meantime, please remember that if you use a DLL project, you must ensure that the DLL capabs are at least as wide as your EXE's.
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-7031189493229094162012-06-25T23:28:00.000+02:002013-06-25T13:57:09.202+02:00Random thoughts on Moq and incremental assertions<script type="text/markdown">
<![CDATA[
This text relates to my earlier proposal of a mock.Count method: https://github.com/Moq/moq4/pull/17#issuecomment-6538764
I publish this text here so as not to spam the issue tracker with many pages of loose thoughts.
I think that maybe I didn't express myself too clearly. I was not saying that Reset is bad/useless/etc in a general manner. I meant that:
- Reset() -- resets the whole log - is easy to use
- Reset(filter) -- removes selected entries - has its pros, but despite being similar to Verify, it is much harder (than Verify) to use properly in any non-trivial case, and in some cases may even be useless where cumulative Verifys would work
- .Count(filter) -- is not as "clean" as Reset(filter) due to the needed local variables, but does not have the drawbacks that make Reset(filter) potentially hard to use
If you wonder about the drawbacks: it is the state mutation, performed with a "greedy" filter.
Facts:
- Verify and Count work intuitively, because they don't change anything - they just check or summarize the current state.
- Reset, on the other hand, modifies the current observation state.
- Filters come in two flavours: precise `It.Is<X>( x => spec-of-x )`, and range-wise: `It.IsAny<X>()`
- IIRC, even the precise filter can be made ranged, for example with the spec `val >= 15`
Imagine a case:
interface IManager
{
void Register(int priority, int itemType);
}
class Tested
{
...... ctor, IManager mgr variable, ...
public void DoWhizz(..params, params..)
{
...blah...
mgr.Register( 0, 20 );
...blah...
mgr.Register( 0, 20 );
...blah...
mgr.Register( 5, 0 );
...blah...
mgr.Register( 5, 20 );
...blah...
mgr.Register( 5, 20 );
...blah...
}
}
void TheTest()
{
Mock<IManager> mock = .....;
var obj = new Tested(mock.Object);
obj.DoWhizz(.....);
// // // mock.Verify(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ), Times.Exactly( 4 ) ); // A, would pass
mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ) , Times.AtLeast( 3 ) ); // A, passes
mock.Reset(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) )); // B
obj.DoWhizz(.....);
mock.Verify(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ), Times.Exactly( 4 ) ); // C, passes
mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ) , Times.AtLeast( 6 ) ); // D, FAILS
}
This test was written with some black-boxy assumptions:
- during all DoWhizz calls, it should Register at least 3 items with high priority
- during the very first call, we do not care about the item types, as it may for example, reregister some historical things
- during later calls, we want exactly 4 new items of "higher type" to be processed
The line tagged (A) is obvious. At this point in time the `val>15` part is ignored, but if I knew that there was no history/etc, I could use it and it would pass.
In line (C) for some reason, I wanted to not accumulate expectations and write 8, for example, for test mainterance reasons.
So, in line (B), I reset the counters for the calls in question. Maybe I was new to that, or maybe I did not think enough about it, or maybe my actual case was just a bit more complex, with the issue nested deeper and harder to trace - whatever. I used the same filter I would use later in (C), and it is also the same filter I'd use in (A) if I could - it "felt obvious and natural".
The effect is that the under-specified reset filter in line (B) implicitly mutated the counters not only for the first, but also for the second issue that this test checks. I reset the counts for the second param, and the count for the first param implicitly changed too: at point (D) it will now be 4, not 6. This is obvious when you look at the implementation of the Tested class, and if you know exactly how the invocation log was 'recorded'. **Otherwise, this is not so obvious.**
To perform a **proper** reset in this case, the reset would have to look like:
mock.Reset(o => o.Register( It.Is<int>(val => val <= 3), It.Is<int>(val => val > 15) ));
and this starts to show the possible complexity of having to **manually slice the parameter domain** into disjoint regions to be dropped. With integers and 1 or 2 params this is easy, but later it will become a tough piece to maintain.
Assuming you don't want to analyze and slice the domain, you may try to just "intuitively" reset for the other parameter too:
mock.Reset(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ));
mock.Reset(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ));
Right? Not!
First note: what if there were more parameters? Add more resets?
Second note: actually, with each such line you would introduce **more uncertainty**, because the second line has its second param open. These two lines aggregate to:
mock.Reset(o => o.Register( It.IsAny<int>(), It.IsAny<int>() ));
so, actually, they amount to an unfiltered reset that clears everything related to that method. If that has helped in our case, then we did not need the filter at all. If it did not help - then the reset must be specified properly by slicing the domain and praying that we have few, separable parameters:
mock.Reset(o => o.Register( It.Is<int>(val => val <= 3), It.Is<int>(val => val > 15) )); /// <--- the only safe and proper way
So, while it is a useful feature, in my view some unpleasant scenarios show up in most nontrivial cases:
- **case1:** those not fluent in the topic will "intuitively" expect that line (D) will have a count of 6 - because they have reset/filtered the OTHER parameter
- **case2:** those more experienced will know that a clash can happen and will manage to avoid it, but they will be unable to guess what count they should expect at (D), because the DoWhizz calls will be a black box
- **case3:** the parameters will be too entangled, or the clash will be nested deep enough, to make the parameter domain too hard to slice properly, thus making it impossible to determine the count at (D) after the partial reset
As I already said, there's an 'oops': with Verify & Times, the writer has to specify **absolute bounds** and is unable to express **relative bounds**. The addition of Reset did not help - it fixed one assertion by resetting it to zero, but damaged the other one by making its absolute value hard to determine -- because there is **no way to learn** the number of invocations other than asserting it with Verify & Times and manually reading the exception text.
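(For the record, there is a workaround outside this proposal: newer Moq versions expose the raw invocation log publicly as `mock.Invocations`, so a poor-man's Count can be hand-rolled. A sketch, assuming such a Moq version; `CountRegisterCalls` is my own helper name, and `IManager` is the interface from the example above:)

using System;
using System.Linq;
using Moq;

static class MockCounting
{
    // Walks the mock's recorded invocation log and counts the Register calls
    // whose (priority, itemType) arguments satisfy the given filter.
    public static int CountRegisterCalls(Mock<IManager> mock, Func<int, int, bool> filter)
    {
        return mock.Invocations.Count(i =>
            i.Method.Name == nameof(IManager.Register)
            && filter((int)i.Arguments[0], (int)i.Arguments[1]));
    }
}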
If Count were added, the case is trivial to fix:
void TheTest()
{
    Mock<IManager> mock = .....;
    var obj = new Tested(mock.Object);
    obj.DoWhizz(.....);
    mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ) , Times.AtLeast( 3 ) );
    mock.Reset(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) )); // <--- damages some counts
    int cnt = mock.Count(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() )); // just CHECK them
    obj.DoWhizz(.....);
    mock.Verify(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ), Times.Exactly( 4 ) );
    mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ) , Times.AtLeast( cnt+3 ) ); // and USE them
}
Instead of analyzing what and how to reset, and writing several resets to properly clear chunks of the parameter domain, I just read the current value and use it in the assertion. In my view, it is as short, as easy to comprehend, and as readable as it can get.
The same test with Count only is, from my point of view, even more straightforward, because you don't have to think about what hidden side effects the Reset could have:
void TheTest()
{
    Mock<IManager> mock = .....;
    var obj = new Tested(mock.Object);
    obj.DoWhizz(.....);
    mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ) , Times.AtLeast( 3 ) );
    int cnt1 = mock.Count(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ));
    int cnt2 = mock.Count(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ));
    obj.DoWhizz(.....);
    mock.Verify(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ), Times.Exactly( cnt1 + 4 ) );
    mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ) , Times.AtLeast( cnt2 + 3 ) );
}
I hope I managed to show you that:
- Multiple subsequent calls to Count are orthogonal, while calls to Reset are not
- addition of Count overlaps with Reset only partially
- addition of Count helps to deal with some problems of Reset
- Count does not have the problems of Reset
- compared to Reset, Count forces you to add a similar number of "extra lines", usually fewer (no params slicing)
I wrote a tome, so I'll add a few words more :)
The last example shows that Count is used like Verify with some sort of initial checkpoints. It is a little work to write cnt+4, and it may look ugly to some, so I also suggested a Times.Add() method as some sugar.
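Just to illustrate the idea (the `Add` helper below is only my approximation of that suggestion, not an existing Moq member), such sugar could be prototyped as a plain static helper on top of the real `Times.AtLeast`:

using Moq;

static class TimesSugar
{
    // Hypothetical sugar approximating the suggested Times.Add():
    // shifts an at-least expectation by a previously read baseline count.
    public static Times Add(int baseline, int atLeast)
    {
        return Times.AtLeast(baseline + atLeast);
    }
}

// usage: mock.Verify(..., TimesSugar.Add(cnt, 3)); // same as Times.AtLeast(cnt + 3)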
Actually, what I'd like to see is:
----------------------------------
void TheTest()
{
    Mock<IManager> mock = .....;
    var obj = new Tested(mock.Object);
    obj.DoWhizz(.....);
    // var mark1 = mock.Mark(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ) ); // either that
    var mark1 = mock.Verify(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ), Times.AcceptAny() ); // or that
    var mark2 = mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ), Times.AtLeast( 3 ) );
    obj.DoWhizz(.....);
    mark1.VerifyNext( Times.Exactly( 4 ) ); // no more filter copying just to verify the same thing
    mark2.VerifyNext( Times.AtLeast( 3 ) );
    mark2.Verify( Times.AtMost( 3 ) ); // why not have both a Verify and a VerifyNext here?
    // below - contrived features :)
    Assert.True( mark2.Count % 3 == 0 ); // Count returns the last count stored by the marker, if the Times are not enough
    mark2.Count = 15; // marks are meant to be simple. You can do that if you know better
    int cnt = mock.Count( mark1.Observed ); // you can always check the most recent count without bumping the markers
    var markX = mock.Verify( mark1.Observed, Times.AtMost( 5 ) ); // the marker remembers the filter? coooool
}
Currently, Verify returns void. This is a waste, because it is a great, handy way to provide simple checkpointing that could increase readability or offer new sly features. The API change is minimal: Verify works as always and throws when the Times are not satisfied, but when it succeeds, it returns a marker with some information on the last state of the observation.
However, you'd get a marker only when Verifying something. What if you just want a new marker without checking anything? That would need a new factory method - or it can be solved just by adding a pass-all option to Times.
The Mark/Checkpoint object may be a simple struct that just remembers the lambda and the count. VerifyNext would update the internal counter, so several VerifyNexts would behave intuitively. The Count property is only used as a base offset for the "next checks"; it does not have anything to do with the actual invocation history log - it just offsets the Times during VerifyNext. The Observed property (or Filter, or Condition, the name is not so important) carries the original expression provided to Verify. The whole Mark is really a Mark<TExpr> and carries 100% of the initial information about what the writer wanted to observe. This makes making similar assertions much easier.
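A minimal sketch of such a marker, assuming a hypothetical way to count the invocations matching the remembered filter (stock Moq has no public filtered count, so a `Func<int>` provider stands in for it here):

using System;

// Sketch only: 'countMatching' stands in for whatever the mock would
// internally use to count invocations matching the remembered filter.
class Mark<TExpr>
{
    private readonly Func<int> countMatching;

    public TExpr Observed { get; private set; } // the original filter expression
    public int Count { get; set; }              // baseline for the "next" checks

    public Mark(TExpr observed, Func<int> countMatching)
    {
        Observed = observed;
        this.countMatching = countMatching;
        Count = countMatching(); // remember the current state as the checkpoint
    }

    // Verifies that exactly 'expected' NEW matching calls happened since the
    // last checkpoint, then bumps the baseline so chained calls stay intuitive.
    public void VerifyNextExactly(int expected)
    {
        int current = countMatching();
        if (current - Count != expected)
            throw new InvalidOperationException(string.Format(
                "Expected {0} new matching calls, observed {1}.", expected, current - Count));
        Count = current;
    }
}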
With such thingies, the example from earlier parts would boil down to:
void TheTest()
{
    Mock<IManager> mock = .....;
    var obj = new Tested(mock.Object);
    obj.DoWhizz(.....);
    var mark1 = mock.Verify(o => o.Register( It.IsAny<int>(), It.Is<int>(val => val > 15) ), Times.Any() );
    var mark2 = mock.Verify(o => o.Register( It.Is<int>(val => val > 3), It.IsAny<int>() ), Times.AtLeast( 3 ) );
    obj.DoWhizz(.....);
    mark1.VerifyNext( Times.Exactly( 4 ) );
    mark2.VerifyNext( Times.AtLeast( 3 ) );
}
which, in my opinion, is far superior to the addition of .Reset and .Count
There is no rush, and I am aware that most people do not do any "incrementality" in their tests; they probably just multiply the test cases five times and test things separately. I do not negate this approach: sometimes it is good and desired. However, when you have nontrivial setups to perform, you actually want either to reuse as much as possible (even the assertion builders), or to check as much as possible at the cutpoints you have just meticulously crafted.
The addition of .Count is ready and is an instant change: it is trivial, and it provides features similar to the 'marks', just with less syntax sugar. Throwing Reset in adds more sugar, but Count is still needed.
When I get another few days off from work, I will try to write it up, and probably post it as a separate proposal.
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-34061966281348638702012-05-14T01:16:00.002+02:002012-05-14T01:33:18.409+02:00Some flashbacks on WP7 Pivot/Panorama/WebBrowser problems<script type="text/markdown">
<![CDATA[
Completely by accident, I just stumbled upon:
http://www.scottlogic.co.uk/blog/colin/2011/11/suppressing-zoom-and-scroll-interactions-in-the-windows-phone-7-browser-control/
Colin - thanks for the credit! And for talking more on the topic. I've never had enough time to describe a fuller solution than that one lengthy post. A user asked me recently about the code, and, of course, I shamefully still lack the time, so I'll give him a link to your article.
If somebody is interested, my original post that Colin refers to is: http://stackoverflow.com/a/7347448/717732 and I also recommend reading another one, http://stackoverflow.com/a/7391698/717732 as it also contains a few bits.
At that time, I wrote quite a few posts related to WebBrowser and Pivot/Panorama on WP7, but that one is the most complete - and still, it is not complete.
I really regret that I did not write about it in full at that time. I remember I did a lot of research on what is and what is not possible, and how to achieve useful effects. I realize that even on 7.5 it is still an issue, but unfortunately, I cannot currently provide you with my original code and ready-to-use components, because .. it is a part of a published application ('OnetNews') and the last time I spoke about it, its owner wasn't too happy about the idea of sharing the code..
BTW. I've just remembered a very important note: the WebBrowser control on WP7.0/7.1/7.5 seems to have a **serious** memory leak that manifests rarely, but (almost) deterministically if your application meets certain criteria. For example, I've noticed that one such case is .. putting a few WebBrowsers across a few Panorama pages and dynamically changing their URI/Source upon swiping the Panorama's pages. I failed at pinpointing what exactly the problem is. It seems to be some race condition that causes a loss of even a few megabytes of memory per swipe-and-uri-reload. For a medium-sized application, it means that your application may crash after 14-30 page swipes. Also, be careful with the images you place or link to - I remember I could easily crash the whole application by navigating the WebBrowser to a page that displayed some large photos (frankly, all images seemed to be GC'ed properly by the WebBrowser - if only the app didn't crash).
If I manage to find the time and my old notes, I'll try to post more info, but again, I can't promise when that will be :|
]]>
</script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-2353222392640616732.post-7477769538015871252012-04-16T04:07:00.000+02:002012-04-16T04:09:33.377+02:00VisualStudio 2010 registry chaos<script type="text/markdown">
<![CDATA[
Actually, not quite chaos, but it is a catchy word..
The case started with a short note I've received from EmperorXLII:
<!-- language: lang-none -->
> I wanted to let you know that the new .vsix does not support
> command-line execution through MSTest.exe of test assemblies.
>
> In other words, from a VS command prompt,
> mstest /testcontainer:MyTest.dll
> reports "No tests to execute".
>
> I still had the .reg file from the previous version; adding
> that to the registry fixes the mstest issue.
I immediately thought about two registry entries I had intentionally skipped in the newest version, but they were really irrelevant. What really surprised me is that after importing the old .reg file - it started working. It simply could not.
If it started working when he used the old .reg file, it means that he probably accidentally activated a version older than 2.1 (older than the one with the 'installer'). MSTest most probably loaded some old modules from the 1.2 or 2.0 version, not deleted and still sitting in /PrivateAssemblies.
After what I've traced in the last few days, I'm quite sure about that!
<a name='more'></a>
The problem
-----------
There's a lot of smoke out there when you try to find out which registry keys VisualStudio actually reads, and how. If you look at its modules, almost all of them refer only to the HKLM hive, but when you check at runtime - it turns out that the HKCU keys are read in their stead.
For example, looking at TestTypes registration in HKLM, there are two types:
- {13cdc9d9-ddb5-4fa4-a97d-d965ccfc6d4b} - UnitTest TIP
- {ec4800e8-40e5-4ab3-8510-b8bf29b1904d} - Ordered AutoSuite TIP
and if checking in HKCU, assuming xvsr10 is installed:
- {13cdc9d9-ddb5-4fa4-a97d-d965ccfc6d4b} - UnitTest TIP
- {4d2f9ccb-49d3-4caa-9a29-beffc604075a} - XUnitTestTip
- {ec4800e8-40e5-4ab3-8510-b8bf29b1904d} - Ordered AutoSuite TIP
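You can compare the two hives yourself with the plain framework Registry classes; a quick sketch (run it as a standalone console app, outside devenv - for reasons that will become clear below):

using System;
using Microsoft.Win32;

class HiveDiff
{
    const string TestTypes =
        @"Software\Microsoft\VisualStudio\10.0\EnterpriseTools\QualityTools\TestTypes";

    static void Main()
    {
        // enumerate the registered TIP GUIDs in both hives
        foreach (var hive in new[] { Registry.LocalMachine, Registry.CurrentUser })
            using (var key = hive.OpenSubKey(TestTypes))
                Console.WriteLine("{0}: {1}", hive.Name,
                    key == null ? "(missing)" : string.Join(", ", key.GetSubKeyNames()));
    }
}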
As you probably guessed from the foreword, the command-line MSTest reads the HKLM entries, while VisualStudio reads the HKCU ones. The funny thing is that neither MSTest nor VisualStudio cares about that. Both of them delegate all the tasks to the QualityTools package. They both delegate to the very same assembly.
So how does it happen that the same code, from the same package, switches back and forth between registry hives?
VisualStudio named instances
----------------------------
If you have ever tried the VS Extensibility SDK, created a tutorial-ish VSIX and tried to debug it, you have probably seen a "Visual Studio Experimental Instance". Shortly: this is the very same VS that you use normally, but it is artificially switched to a sibling registry root key by a command-line switch: `/rootsuffix Exp`
This switch causes the VisualStudio to stop using:
HKCU\Software\Microsoft\VisualStudio\10.0
and start using:
HKCU\Software\Microsoft\VisualStudio\10.0Exp
as its root registry key for keeping most of its configuration. In terms of VSIX, this allows you to temporarily install your extension-under-development without the danger of bricking your main VisualStudio instance.
However, if you look there, you will also find:
HKCU\Software\Microsoft\VisualStudio\10.0_Config
and you may also find:
HKCU\Software\Microsoft\VisualStudio\10.0Exp_Config
and if you look inside them, their contents will look, well, quite similar to those without the _Config suffix.
And yet there's also a HKLM version.
So what is all that for?
The 'Chaos'
-----------
It turns out that VisualStudio's packages use a layered registry architecture:
- HKLM\Software\Microsoft\VisualStudio\10.0 is where the original information is installed once, per-machine
- HKCU\Software\Microsoft\VisualStudio\10.0{suffix} is the per-user configuration that shadows the HKLM, so that Alice and Bob can have their own settings, and moreover, so they both can run their 'experimental' instances
- HKCU\Software\Microsoft\VisualStudio\10.0{suffix}_Config is the current user's instance configuration, which shadows the original user's settings and is easily removable in case something fails badly and needs to be reverted
So, if some required value is not found in hkcu-{suffix}_Config, it is searched for in hkcu-{suffix}, and then, finally, in hklm.
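Expressed as code, the fallback order looks roughly like this (a sketch, not VS's actual implementation; the helper name and the value-based lookup are mine):

using Microsoft.Win32;

class LayeredRegistry
{
    // Sketch of the described chain for VS 10.0, with an optional instance
    // suffix ("" for the normal instance, "Exp" for the experimental one).
    static object ReadVsSetting(string subKey, string valueName, string suffix)
    {
        string user = @"Software\Microsoft\VisualStudio\10.0" + suffix;

        foreach (string root in new[] { user + "_Config", user })
            using (var key = Registry.CurrentUser.OpenSubKey(root + "\\" + subKey))
            {
                object value = (key == null) ? null : key.GetValue(valueName);
                if (value != null)
                    return value; // found in a shadowing per-user layer
            }

        // finally, fall back to the per-machine original
        using (var key = Registry.LocalMachine.OpenSubKey(
                @"Software\Microsoft\VisualStudio\10.0\" + subKey))
            return (key == null) ? null : key.GetValue(valueName);
    }
}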
Of course, the user configurations are really meant to be per-user, so when Alice installs an extension - its setup gets written to her HKCU registry settings, and Bob's registry is untouched. He will not have that extension activated. This is a great thing! User isolation, licensing and other blahs.
But let's look back at the VisualStudio package development. We want to provide an extension to the VS. Do we really have to support this registry chain?
Well, I actually don't have to use the registry at all, at least in terms of xvsr10 - the `[RegisterTestTypeNoEditor]` attribute and the CreatePkgDef tool from the SDK actually did all the work for me. However, if I wanted to manually keep something in the registry - then yes, I suppose I'd have to implement the chain.
And I'd be completely wrong.
It turns out that the VisualStudio environment provides several ways to access the registry, most notably the `class Microsoft.VisualStudio.Settings.ExternalSettingsManager`, but let's leave it aside for now. The interesting part is that probably almost no package loaded into VS uses the HKCU at all!
For example, the QualityTools package uses the `TestTypeInfoCollection`, `TestConfigHelper`, and `TestConfigKey` classes to read the list of TIPs from the registry, and they all very firmly point to the "LocalMachine" keys. Of course, the leaf `TestConfigKey` is a wrapper over the framework's `RegistryKey` class, so no big deal - the wrapper handles the switching? Not! This wrapper is dumb. It is little more than lazy-loading support.
Before you start wondering - wait, there's more! If the plugin/extension you wrote executes similar code:
var lm = Microsoft.Win32.Registry.LocalMachine;
var key = lm.OpenSubKey("Software\\Microsoft\\VisualStudio\\10.0\\EnterpriseTools\\QualityTools\\TestTypes");
var count = key.SubKeyCount;
it will actually read the value Count=3 instead of Count=2, which is what you would expect when reading from HKLM (see the list in 'The problem' section above). And those classes are the normal framework classes, no custom wrappers. If you inspect the objects with a debugger, it turns out that they store "HKEY_LOCAL_MACHINE" strings and they *even* call advapi32.dll:RegOpenKeyEx with the 0x80000002 argument, which is, of course, HKEY_LOCAL_MACHINE.
And now, guess what!
When run from within MSTest, the QualityTools package just reads directly from HKLM. The magic is gone. No 'registry substitution' occurs. And probably, any other package would behave the same.
So what is really going on there?
For every feature, there's a bug
--------------------------------
And the bugs do form the chaos.
The devenv.exe executable that provides the entry point for VisualStudio's IDE contains something very clever, tricky, and simultaneously nasty: win32api hooks. Or rather - hijacks, or 'detours', as Microsoft likes to call them lately. Devenv.exe behaves just like a traditional virus: it seems to load advapi32.dll, but then it rewrites the import chunks with its own trampolines pointing to completely different code. This approach fools the whole managed .Net framework layer, as in quite a few places it is only a pretty paper wrapper over the native methods. Here, the native methods were replaced with a custom implementation, and the framework happily invokes `RegOpenKeyExW`; even if it cared to check, it would be completely sure it calls it from advapi32.dll, as that import was actually invoked - it was just redirected elsewhere later (by the way, you may want to check the Detours library on MSDN - it uses similar methods).
What's the point of it?
Maybe they wanted to allow developers to invent their own extensions, but did not want them to 'contaminate' the precious registry with their temporary trash.
Or maybe they wanted to provide the extension developers with an automatic way of handling the layered registry architecture, so they wouldn't have to do it manually?
Or maybe, rather, they had a few million lines of existing code that used more-or-less hardcoded HKLM references, and they wanted a way to quickly make the whole configuration per-user and per-named-instance?
Well, whatever. They succeeded only partially: the IDE works smoothly. But the rest - has failed.
The MSTest tool does not contain ANY of those detours. I didn't check, but I'd bet that most of the tools sitting in Common/IDE, or even most of the tools that **don't use the VS shell**, do not contain those detours.
This means that all of those are automatically crippled: they are unable to notice **any** of the available per-user and per-instance settings, and will only read the default configuration from the HKLM hive. No magic, no cookies. But this is only my guess for now; I have been playing around only with MSTest.
This may seem a small issue, not worth calling a bug and writing an article about. Of course, simply copying the keys and values to the HKLM hive is sufficient, and the QualityTools package loaded by MSTest immediately notices the entries and the extra TIP. But what about Bob? He did not want it installed. Or even, maybe he has not paid for the fancy new extension? (By the way, the same problem occurs if you drop your binaries into Common/IDE/PrivateAssemblies or PublicAssemblies - they instantly go global and are visible to everyone.)
And I think that the way devenv.exe hacks hard into the lower layers is worth a note :)
For every bug, there's a facepalm
---------------------------------
Remember the `ExternalSettingsManager` I mentioned earlier? It is public - feel free to play with it!
Create a new project, reference `Microsoft.VisualStudio.Settings` (it is in the GAC, or in C:\Program Files\Microsoft Visual Studio 2010 SDK SP1\VisualStudioIntegration\Common\Assemblies\v4.0\Microsoft.VisualStudio.Settings.dll), and write:
using Microsoft.VisualStudio.Settings;
var lm = Microsoft.Win32.Registry.LocalMachine;
var key = lm.OpenSubKey("Software\\Microsoft\\VisualStudio\\10.0\\EnterpriseTools\\QualityTools\\TestTypes");
var count1 = key.SubKeyCount;
var devenvRoot = @"C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE\devenv.exe";
var extmgr = ExternalSettingsManager.CreateForApplication(devenvRoot);
var settings = extmgr.GetReadOnlySettingsStore(SettingsScope.Configuration);
var count2 = settings.GetSubCollectionCount("EnterpriseTools\\QualityTools\\TestTypes");
The result? Count1 is of course 2, as it reads from HKLM, while Count2 is **three**, as the class seems to perfectly support the registry layered-ness. As you probably noticed, it can also provide write access, has several 'scopes' you may ask for, and completely abstracts away the "Software\\Microsoft\\VisualStudio\\10.0" part - so it probably handles not only the HKLM/HKCU switching, but also the /rootsuffix instance switching, and maybe even VisualStudio versioning. I only wonder why I had to provide it with the devenv executable path; I haven't had the time to look into it yet.
*By the way, `Microsoft.VisualStudio.Settings` seems to contain detours similar to those from devenv.exe. For a few hours I was even sure that devenv.exe imports the detours from M.VS.Settings, but as I dug deeper, I'm not so sure now - and unfortunately, my free-time pool has depleted.*
So why does MSTest/QualityTools not use that class? The assembly is loaded into devenv.exe from the beginning; QualityTools could use it freely. To me, it seems more natural to use its API when reading the layered registry than to hijack the winapi layer.
I can only assume that the advapi32 hijacking present in devenv's native loader was added as a radical attempt to reduce the cost of splitting the registry into several shadowed branches. Surely, revisiting all the modules/packages of VisualStudio would have been far more expensive. The `Microsoft.VisualStudio.Settings` assembly was probably created much later, and included in the VS2010 SDK only because some people reported issues with chaotic and incoherent registry reads/writes from different GUIs and tools..
]]>
</script>Unknownnoreply@blogger.com0Gdynia, Polska54.5530112 18.510097835.4625707 -21.9195897 73.6434517 58.9397853tag:blogger.com,1999:blog-2353222392640616732.post-56331004946429040292012-04-02T01:32:00.000+02:002013-06-25T13:57:27.134+02:00Few things about MS QualityTools internals in VS2010<script type="text/markdown">
<![CDATA[
As I was previously searching for the fix for the `IMessageSink` problem, I was also tempted to check why commands like *"Run tests in current context"* or *"Run tests in current class"* were not handled properly. It was very strange, because looking at the code of the runner, everything seemed in perfect order. The tests were discovered, then run, and the results were displayed - although in a somewhat distorted way: the *Owner* column was holding the class name, the *ClassName* column was empty, the *FailureCount* was not recorded at all, etc.
It was obvious that something was not right. The case of *Owner* was trivial - even the original author warned that he couldn't get the *ClassName* filled properly, so he artificially injected the name of the type into *Owner*, just to be able to see, group, and sort the tests.
I immediately thought that *maybe* Visual Studio relies on that very column, the `ClassName`, to navigate between the test and the code, and that maybe this is part of why *"Run in current context"* does not work.
<a name='more'></a>
Overview
--------
The code package responsible for handling the tests is `Microsoft.VisualStudio.QualityTools`. It dictates the architecture of the test plugins - and thus of the xUnit test runner, too:
- a class derived from `Package`, that defines the plugin for VS
- a number of classes derived from `TestElement`, that will work both as custom test definitions and as test instances
- a number of classes derived from `Tip`, that is mostly responsible for test discovery
- a number of classes derived from `BaseTuip`, which actually are windows/panels, used for example to display custom test results
- and so on..
The QualityTools package also provides many ready-to-run features for the MSUnit library, and this is why MSUnit-based tests are seamlessly integrated and beautifully run from within VisualStudio immediately after installation.
The original xUnit runner provided all of the above and acted as a bridge between the QualityTools architecture and xUnit's actual test discoverer and test runner, which are provided by xUnit itself. It also defined services and a panel/window class, but they were not really used (correct me if I'm wrong). The runner uses a tricky exploit: although it defines a custom `TestElement` and handles the run on its own, it does pretty much nothing to display the custom results obtained. Instead, upon completion of the test, it translates a series of recorded xUnit `TestResult` objects into QualityTools' `UnitTestResult` objects of the proper kind, and returns them instead. The `UnitTestResult` is already paired with the existing MSTest results viewer. This way, the results are displayed as if they were generated by MSTest unit tests.
It sounds easy, but keep in mind that `UnitTestResult` is an `internal class`. There's no easy nor pretty way to do it. The actual implementation employs some "dirty" tricks with Reflection and Linq. They are interesting, but outside of this article's scope.
The TIP: Discovery of Test Elements
----------------------------------
As I mentioned, the `Tip` object (which the Package should register) is responsible for test discovery. This is done by overriding the base `Load` method. The method is given an assembly name and some project information, and is expected to return a collection of test definitions, the TestElements.
One may notice that the `Tip` constructor is given an instance of an `ITmi` object. I suppose this name is shorthand for something like *test management interface*. In fact, this is the outer brain and the core element of the QualityTools package. The TIP that a test plugin implements is merely a plugin for that TMI object. The `ITmi` interface defines a dozen very handy methods, should anyone ever have to inspect the test lists manually.
For a notable example, whenever a test project is rebuilt, the `Tmi.LoadTests` method is called, which in turn calls `LoadTestsFromTipsHelper`, which invokes the `Load` methods of all registered `Tip` instances.
I've noticed that the `TestElement` base class defines a virtual `FillDataRow` method. I instantly thought that this is the point where the ClassName should be filled, and tried to do that - it was impossible. The rows to be filled simply did not contain such a column!
If I recall correctly, when the TMI is first initialized, it creates a dummy `TestElement` instance, inspects it for displayable properties, and then creates the initial columns for them. This is a closed mechanism; it cannot be extended. Later, when new `TestElement`s are loaded, they are displayed in those very same tables, and only the base columns can be filled.
FYI:
The core method of these mechanisms is `AddRowAndColumnsToTable`, and the initial columns are initialized as follows:
Type of the table to display            <-  the source of the column definitions
Tmi.StorageElementType.Test             <-  (IVisiblePropertyProvider)dummyTestElement
Tmi.StorageElementType.Category         <-  (IVisiblePropertyProvider)testListCategory
Tmi.StorageElementType.RunConfig        <-  (IVisiblePropertyProvider)runConfiguration
Tmi.StorageElementType.ResultCategory   <-  (IVisiblePropertyProvider)testListCategory
Interestingly, at that point there are no columns like 'Namespace', 'ClassName', 'StackTrace', etc. Please note that all of this occurs when the test project is built. No one has asked for anything to be displayed yet!
Thus, another very interesting internal workflow is the initialization of the test list window. `ControllerProxy.InitializeTestRun` invokes the very important `Tmi.CreateInitialNotRunResults`. This method enumerates all `TestElement`s and inspects their corresponding `Tip`s in order to determine what displayable columns it should prepare. For each `Tip`, an empty dummy `Common.TestResult.TestResult` (the base class of all test results) is created and provided to the `Tip`'s `MergeResults`.
This is the second most important method after `Load`. It is responsible for gathering partial results into one final TestResult, and is meant to return a proprietary result subclass relevant to the actual test type in question. Thus, in contrast to the initialization described earlier, the TMI now inspects not the dummy result object, but the object returned from `MergeResults`: it iterates over its `VisibleProperties` and creates new columns if necessary.
It may be noteworthy that the `VisibleProperties` go through a little filtering. The method `VisualPropertyObtainer.IsSupportedByProduct` tests for any interfering licensing attributes that could have been applied to the test plugin.
This process of building the columns of the test view is important, because it is what initiated my analysis: the *ClassName* column was empty. Remember that the original code of the xUnit runner was using a trick and generated the original internal UnitTestResults. However, it *did not* implement the `MergeResults` method in its `Tip` and did not return a correct object (it used the base implementation!), thus the test view did not know that it needed to inject data columns for such an object!
Seeing this, the solution was immediate: I played with `VisibleProperties`, `FillDataRow`, `MergeResults`, and also threw in my own `TestResult` subclass instead of hacking the internal one - and there it is! The 'ClassName' column got filled properly. Moreover, it seemed possible to completely control what was displayed and where.
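For the record, a hedged sketch of that direction; the `FillDataRow(DataRow)` signature is from my memory of the decompiled base class and may differ between QualityTools versions, and `className` is assumed to be a field of the custom element:

// sketch only: assumes TestElement exposes a virtual FillDataRow(DataRow)
public override void FillDataRow(System.Data.DataRow row)
{
    base.FillDataRow(row);

    // the column exists only once MergeResults returned our own result type
    if (row.Table.Columns.Contains("ClassName"))
        row["ClassName"] = this.className;
}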
However, the *'Run from Context'* was still not working.
The VS QualityTools **is not using** the information from the test list rows to navigate from the code to the tests.
The WTF
-------
Aside from the test discovery and test list building, there are many other interesting workflows inside, for example `Controller.ControllerPluginManager.LoadPlugin`, which inspects the `TestElement.ControllerPlugin` property. Actually, I had no time to check it, but it looks very interesting, as the 'Controller' sits in the hierarchy a little above the internal TMI object :)
The TIP contains some extensible parts, but actually, and pitifully, most of the building blocks are in fact internal/sealed/closed and unmodifiable. One of these is the ... yep, the code-to-test mapping.
And what's more, it is essentially broken - at least from my point of view as a plugin author.
When the user presses the *'Run from Context'* button or menu item, a command of the same name is run, bound to `QualityToolsPackage.OnRunTestsFromContext`[1], which in turn relays much of the initial work to `CodeModelHelper.FindActiveCodeElement`[2]. The latter method investigates the current text selection and returns the CodeElement that most closely relates to it. If no actual "text selection" was made by the user, the cursor position is used. This method works properly, and `OnRunTestsFromContext` receives the code element object, checks the mapping (sometimes rebuilding the projects on the fly), and finally passes the code element to another hard-working method, `QualityToolsPackage.GetTestIdsFromCodeElement`. This method is the core reason for all the problems with `Run from context`.
]]>
</script><br />
<div class="post-sidenote-2">
[1] QualityToolsPackage is in Microsoft.VisualStudio.QualityTools.TestCaseManagement.dll<br />
<br />
[2] CodeModelHelper is in Microsoft.VisualStudio.QualityTools.CMI.dll</div>
<script type="text/markdown">
<![CDATA[
The signature of `QualityToolsPackage.GetTestIdsFromCodeElement` has a few important parts:
- parameter: context, enum of type `QualityToolsPackage.RunTestsContext`
- parameter: element, the code element that defines the 'position'
- out parameter: runAllTests, bool
- return: List<TestId>, the tests matched to the context
The *enum* is defined as {Default,Disabled,Class,Namespace,All}, and it provides an abstraction over all the 'Run from ...' commands, allowing the code to be reused for all of them, at least:
- "Run from context" -> Default
- "Run from namespace" -> Namespace
- "Run from class" -> Class
The *element* parameter is inspected for its 'ElementType', which is another enum, namely `CodeModelHelpers.CodeElementType`, defined as {Assembly,Namespace,Class,Member}. This represents all of the 'scopes' that the test runner can distinguish at the cursor position.
The *runAllTests* out-parameter is a flag, defaulted to true, that is cleared only if at least one valid `TestElement` is found during the search. This is why the test runners really do run all the tests in the solution instead of the ones the user wanted to run *contextually*. It is a brutal fallback, apparently implemented with user-friendliness in mind - the implementor probably didn't want to irritate the user with messages like "sorry, I did not know what Context you meant".
The `GetTestIdsFromCodeElement` switches over the `ElementType` and performs following steps:
- **Assembly**:
- simply returns nothing; *runAllTests* is set to true, so the engine runs all tests from the assembly.
- **Namespace**:
- reads the namespace from the code element, and immediately defaults to a test search loop
- **Class**:
- generates a GUID based on a seed calculated from a string of `Namespace+"."+ClassName`
- asks the TMI to find a `TestElement` with `TestID` exactly equal to the just generated GUID
- if a test is found, returns it as the result
- if not, records the `Namespace` and `ClassName` and defaults to a test search loop
- **Member**:
- generates a GUID based on a seed calculated from a string of `element.FullName`
- asks the TMI to find a TestElement with TestID exactly equal to the just generated GUID
- if a test is found, returns it as the result
- if not, records the `Namespace` and `ClassName` and defaults to a test search loop
The steps above are simplified a bit for readability. The 'Context' parameter must be taken into consideration, too:
- The hashing is performed only when the `Context == Default`; in other cases only the search loop is executed
- The `ClassName` is recorded only if the `Context != Namespace`; thus, if the user wanted *'Run from Namespace'*, all the fine-grained searches are skipped
Assuming none of the quick lookups succeeded, a search loop is performed. Actually, there are two: one for the class scope and one for the namespace scope, but they are almost identical. The algorithm is as follows:
var found = new List<TestId>();
if (!string.IsNullOrEmpty(className))
    foreach (ITestElement testElement in tmi.GetTests())
    {
        // note: a cast to the INTERNAL class, not to an interface
        UnitTestElement unitTestElement = testElement as UnitTestElement;
        if (unitTestElement != null
            && unitTestElement.ClassName.Equals(className, StringComparison.Ordinal)
            && unitTestElement.Namespace.Equals(namespaceName, StringComparison.Ordinal))
        {
            runAllTests = false;
            if (testElement.Enabled)
                found.Add(testElement.Id);
        }
    }
While this is absolutely correct from the implementation point of view, please note the cast to `UnitTestElement`. This class is `internal` and belongs to QualityTools and MSUnit. The properties `ClassName` and `Namespace` are not defined in the base `TestElement` class, because that class is on a higher abstraction level and may be used for tests that have no such notions. Therefore, an interface `IUnitTestInformation` has been introduced, and this interface defines the code location properties: `FullName`, `Namespace`, `ClassName` and `MethodName`. However, the implementor accessed the properties not via the interface, but by a direct cast to the **internal class**, and that **essentially cancels** any of our further attempts to enhance our `TestElement` implementations.
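For comparison, this is roughly what the loop body could have looked like had it queried the abstraction instead of the concrete internal class (a sketch only - and since `IUnitTestInformation` is itself `internal`, as noted below, it would not have helped plugin authors anyway without also making the interface public):

// sketch: matching via the interface instead of the internal class
var info = testElement as IUnitTestInformation;
if (info != null
    && string.Equals(info.ClassName, className, StringComparison.Ordinal)
    && string.Equals(info.Namespace, namespaceName, StringComparison.Ordinal))
{
    runAllTests = false;
    if (testElement.Enabled)
        found.Add(testElement.Id);
}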
Please note that your custom TIP can actually perfectly mimic the hashing algorithm, and then it will properly "run-from-context" a single test method or test property - but it will still fail when asked to run from a class context or namespace context, as those are handled solely by the search loops.
As a side note: the `IUnitTestInformation` interface is `internal`, too. I know it's April Fools' Day today, but I'm not joking. They literally started with a very extensible architecture, only to shut the most useful bits away in the final lines, at least from a unit testing point of view. While conspiracy lovers can surely see here an ill marketing attempt to promote MSUnit over plugins, I call it **a bug**.
The solution
------------
Part of the solution was already in place. I've already mentioned that the original author of the plugin found a shortcut, and instead of implementing the `TestResults` and results viewer windows, just properly translated the results into the internal objects.
So, why not do the same now?
It turns out that while the test package registration really requires you to register a new test type (or else you will not be able to register your new TIP instance), it actually does not care at all whether the registered TIP returns tests of that type or of any other type! Let me say that again: the custom TIP may return whatever TestElements it likes, with no respect to the registered test type. Please note that the `TestElement` object has a `TestType` property, and the `TMI.GetTip` method uses it to obtain a TIP for a test. That means that if our custom TIP generates some `TestElement`s with a `TestType` pointing to another TIP - then those tests will be handled and processed by that other TIP. This is a mechanism very similar to the Adapter property.
That means that if TIP implementors want a quick and seamless 'Run from Context' integration, they should abandon implementing their own `TestElement` subclasses altogether. Let your `Tip.Load` return the original `internal UnitTestElement`s, and everything automagically starts working properly. Yeah, more dirty Reflection work.
The constructor of the `UnitTestElement` is very simple (much simpler than those of the `UnitTestResult` class), but requires another internal class, the `TestMethod` (not to be mistaken for the `TestMethod` from the xUnit library - although they are almost identical!). Aside from that, fortunately, there are almost no caveats related to manually constructing such objects.
Except for three:
- the constructor does not set the `CodeBase` property
- and neither the `Storage` property
- nor the `ProjectData` property!
Of course, all of them must be set up, or else `TestElement.IsValid` will turn false and the TMI will ignore that test. Both `Storage` and `ProjectData` are defined in the base `TestElement` class and are easily accessible, but `CodeBase` is defined by the internal `UnitTestElement` and can be set only via Reflection. The last thing to note is that setting these properties manually causes the `IsModified` flag to be set on the `TestElement`, so it should be manually cleared afterwards. The original code from QualityTools does it just the same way :) Just look at the end of
Microsoft.VisualStudio.TestTools.TestTypes.Unit.VSTypeEnumerator.AddTest
By the way, this method is a beautiful reference on how to spoof, erm, I mean, set up a `UnitTestElement` instance. In this method you will find all the details about the meaning of the various method- and class-level attributes that can be placed on an MSUnit test, and how they are mapped to the `UnitTestElement`'s configuration. This method handles .Net, .Net CF, and ASP.Net, so it really is worth a look.
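The Reflection plumbing needed for all of the above boils down to two generic helpers; a sketch using only standard framework calls (the type and member names you feed them are, of course, QualityTools-version-specific):

using System;
using System.Reflection;

static class InternalPlumbing
{
    // Instantiate a non-public type (e.g. the internal UnitTestElement)
    // by its full name, from a given assembly.
    public static object CreateInternal(Assembly asm, string typeName, params object[] args)
    {
        Type type = asm.GetType(typeName, true);
        return Activator.CreateInstance(type,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null, args, null);
    }

    // Set a property that is not reachable through a public base class
    // (e.g. the CodeBase property mentioned above).
    public static void SetProperty(object target, string name, object value)
    {
        PropertyInfo prop = target.GetType().GetProperty(name,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        prop.SetValue(target, value, null);
    }
}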
Another way to support *'Run from Context'* properly would be to implement one's own *'Run from Context'* command handler. I have not tried it, as I consider it completely insane given the amount of additional work. Also, I suppose that the original handlers would have to be unregistered first, and that may be a little tough. If anyone needs that, here's the registration of the original handlers:
// from the class QualityToolsPackage
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.RunTestsFromContext1);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.DebugTestsFromContext);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.RunTestsInClass);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.DebugTestsInClass);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.RunTestsInNamespace);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.DebugTestsInNamespace);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.DebugAllTests);
CommandHelper.AddCommand(this.m_menuService, new EventHandler(this.OnRunTestsFromContext), new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.RunAllTests);
// and a bit later:
MenuCommand command1 = (MenuCommand) new OleMenuCommand(new EventHandler(this.OnRunTestsFromContext), (EventHandler) delegate {}, new EventHandler(this.QueryStatusRunTestsFromContext), VSEqt.Commands.RunTestsFromContext2);
command1.Enabled = true;
this.m_menuService.AddCommand(command1);
The cherry on the cake: a better Runner
---------------------------------------
Of course, I've incorporated the solution into the xUnit test runner, and it works great! Except for the Reflection code, the whole runner has simplified significantly, and quite large sections of the code were removed. Notably, the window, the service interface, and the TUIP were removed completely. They seemed unused before, and now they were completely unused, as the pair of internal `UnitTestElement`/`UnitTestResult` simply uses the original UI from QualityTools/MSUnit. I've refactored the Reflection code into a separate class, `InternalAccess.cs`, for easy copying, in case anyone wants to incorporate the `UnitTestElement`/`UnitTestResult` generation into their own plugin (http://nunitforvs.codeplex.com/workitem/32394 maybe?).
Summarizing all that I've said - let's review what's left in the runner!
- `XUnitTestPackage`, that registers a test type and the TIP
- `XUnitTestTip`, paired to the test type, but ignoring it completely
- `XUnitTestAdapter`
- `XUnitTestRunner`
- and a `XUnitDummyTest`
- `InternalAccess`, for the Reflection stuff
The `XUnitDummyTest` is the previous `XUnitTest`, which has now lost all of its implementation. It is a hollow stub, kept only because the TIP registration needed it.
The `XUnitTestPackage` lost over 60% of its implementation and is now a hollow class, existing solely for the purpose of having `[RegisterTestTypeNoEditor]` over itself, just to register the TIP.
The `XUnitTestTip` now just asks xUnit for a test list and builds a list of `UnitTestElement`s.
The `XUnitTestAdapter` is almost unchanged: it still relies on the `XUnitTestRunner` to execute the tests, and later constructs the `UnitTestResult`s from the actual results.
The `XUnitTestRunner`, too, has hardly changed at all. The only adjustment came from the removal of the custom `TestElement`: the full class name must be known along with the name of the method to run, and they must be extracted via Reflection from the `UnitTestElement` that is now in use.
So far, except for the Reflection tunnels, the code is actually minute.
The current state of the plugin is available at:
https://github.com/quetzalcoatl/xvsr10
Still, please treat it with caution: while it runs beautifully on my machines, and the code is tiny and there aren't many klocs left for bugs, this is still a rather fresh 'product'.
]]>
</script>Unknownnoreply@blogger.com0Gdynia, Polska54.5188898 18.530540954.4451578 18.372612399999998 54.592621799999996 18.6884694tag:blogger.com,1999:blog-2353222392640616732.post-22556529182623344262012-03-26T19:45:00.003+02:002013-06-25T13:57:38.967+02:00Obscure bug in xUnit's .Net Remoting layer vs. XUnitForceLegacyCallback<div class='post-sidenote'>with nods to EmperorXLII and his <a href="http://xunit.codeplex.com/workitem/5648">work for xUnit</a></div>
<div class='post-sidenote'>and of course, to the whole <a href="http://xunit.codeplex.com">xUnit</a> team, too :)</div>
<!-- language-all: cs -->
<script type='text/markdown'><![CDATA[
There's an "unknown incompatibility" problem mentioned in the comments, which forced [the original author](http://www.codeplex.com/site/users/view/EmperorXLII) of [*"VSTS Test Runner for xUnit"*](http://xunit.codeplex.com/workitem/5648) to add the **XUnitForceLegacyCallback** hack was quite tough to trace. It took me more than 8 hours to diagnose and fix, so if you, dear reader, are here already, please at least skim through it and either remember what `IMessageSink` really is, or write down somewhere the helpful external links :))
<a name='more'></a>
Quick recap - the original problem
----------------------------------
xUnit's test runner was crashing badly when run within the VisualStudio environment (by, for example, *"Run all tests in solution"*), because a `NotSupportedException` was thrown somewhere in the middle. With a bit of bad luck, you could also see a RemotingException or a SecurityException instead, or some messages about *"Method not supported in a dynamic assembly"*.
The author of [the runner](http://xunit.codeplex.com/workitem/5648) had to inject a hotfix into xUnit's `ExecutorWrapper` that forced xUnit to use a "legacy callback mechanism", that is, to turn off the new messaging layer (introduced, I think, somewhere around xUnit 1.6) in favor of the old layer that used handcrafted callbacks. In fact, xunit.runner.utility's *"version-resilience"* requires it to keep all the previous APIs around and dynamically switch between them depending on what version of xunit.dll it detects, so the hotfix simply forced it to always fall back to the pre-1.6 version. The hotfix is easily noticeable in the VSTS Runner's code; just search for `XUnitForceLegacyCallback`.
The runner is currently not supported by the xUnit team, as they are looking forward to VS2011. However, that does not mean no one is using VS2010 now. Well, at least I am, and I want to have xUnit right at hand. I was working on some tweaks for [Moq](http://code.google.com/p/moq/) and [AutoMoq](http://blog.ploeh.dk/2010/08/19/AutoFixtureAsAnAutomockingContainer.aspx), and I actually managed to extend Moq with [something quite reasonable](https://github.com/Moq/moq4/pull/4). However, some DLL version collisions forced me to recompile most of the libraries under .Net4.0 and .. it broke [the runner](http://xunit.codeplex.com/workitem/5648) somehow! Diffing the code, I noticed that a fresh checkout from xUnit's repository had the fix I was talking about earlier removed, and so my curiosity ignited.
What
----
After a bit of mindless digging, it turned out that the direct cause was passing some `IMessage` objects across the AppDomain boundary, and the root error message was *"The method was called with a Message of an unexpected type"*. That was the easier part to trace, and it looked quite simple: a remote method was passed some message object it could not handle.
When
----
The trickier part was finding out **which** object that was, and why. Fortunately, `NotSupportedException`s are not thrown very often, and a first-chance exception catcher quickly found the place:
`ExecutorWrapper.OutgoingMessage`
that was returned from
`XmlNodeCallbackHandlerWithIMessageSink`
and its implementation of
`IMessageSink.SyncProcessMessage`
It is called in the last steps of most of the workflows, wrapped in the `OnTestStart/End` methods, for example:
// Executor.OnTestStart:285:
callback.Notify(node.OuterXml) // callback was the XmlNodeCallback
// called from Executor+RunTest.RunTest:286:
executor.RunOnSTAThreadWithPreservedWorkingDirectory(() =>
TestClassCommandRunner.Execute(testClassCommand,
methods,
command => OnTestStart(command, handler),
result => OnTestResult(result, handler)));
Why (teaser)
------------
All the message passing was implemented with the `IMessageSink` interface, which accepts any `IMessage`. Surely the OutgoingMessage implemented it. And there are no manual argument assertions in xUnit's code that would throw a `NotSupportedException` when receiving such an object. What's more, it was a callback! So a similar `ExecutorCallback.MessageSinkCallback.OutgoingMessage` had already passed through properly - all in all, `handler.Notify` caused the remote sink in `XmlNodeCallback` to be called, right? NOT!
The code of the xUnit test runner is *version-resilient* and was recently upgraded from using 'callbacks' to using those two interfaces, `IMessage` and `IMessageSink`. In this new version, the proxies and `IMessageSink`s were used 'directly', as in classic Remoting samples - xUnit was first **manually casting** the proxy to `IMessageSink`, and then, later, calling IMessageSink.SyncProcessMessage() **manually, too**:
// ExecutorCallback.Wrap():26
IMessageSink messageSink = handler as IMessageSink;
if (messageSink != null)
return new MessageSinkCallback(messageSink);
return new CallbackEventHandlerCallback(handler);
// ExecutorCallback.MessageSinkCallback.Notify():58
OutgoingMessage message = new OutgoingMessage(value);
IMessage response = messageSink.SyncProcessMessage(message); // <----
if (response != null && response.Properties.Contains("data"))
shouldContinue = Convert.ToBoolean(response.Properties["data"]);
Oh boy.. the .Net Remoting Message Sinks are a little more complicated than that!
IMessageSink is an **infrastructure interface**!
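By contrast, the legacy callback path effectively relies on the safe pattern: a plain `MarshalByRefObject` exposing an ordinary method, so that Remoting generates a regular cross-domain method call for it instead of piggybacking on the message-sink infrastructure. A sketch of that shape (illustrative, not xUnit's exact code):

using System;

// An explicit callback object marshaled across the AppDomain by reference;
// Notify is a normal remote method call, no IMessageSink involved.
public class ProgressCallback : MarshalByRefObject
{
    public bool Notify(string outerXml)
    {
        // parse/forward the test-progress XML node here
        Console.WriteLine(outerXml);
        return true; // shouldContinue
    }
}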
Why (retrospection)
-------------------
The *xunit.console* and *xunit.gui* runners use an intra-process AppDomain to launch the code, probably because it is quite easy to set up, and it seems natural to do so after you read all the MSDN descriptions about the gained safety and isolation. However, it is <u>tremendously insecure for this use case</u>.
There are a few subtle reasons why, for example, the SqlServer allows only '100% verifiable' modules to be loaded as its plugins. One of them is: what happens if your code under test loads a NATIVE DLL, and the dll causes an EPIC NATIVE CRASH(tm)? It's a thing from under the .Net layer, and the core Windows system will simply kill your process. And I mean the whole process, not just the AppDomain, can get NATIVELY INSTA-KILLED(tm). Or, depending on your luck, it may survive, but with some random parts of it rendered unusable, in a more or less controlled manner.
AppDomains provide almost no security against ungraceful native crashes. I have learnt this the hard way over the last few years, when I had to develop incredible amounts of wrapper/transport code just because one old native library (whose code I don't have access to, and whose authors were unavailable due to political reasons) liked to trash its process-static data every few dozen calls (guess what! that process-static data was initialized only once, during module load in DllMain), and, what's even worse, every few hundred calls it faulted so hard that my application was killed, with no exceptions raised on the .Net side. Puff, DrWatson, and gone. The best that could be done about it was detecting the first failure, locking the feature down, and displaying an apologizing "please restart me" message. Shame!
Ok, whining aside, the point is that if you want your "runner" to be safe against completely unknown code that may crash in a completely unknown manner, then the only way is to either completely prohibit the usage of any of the dangerous things (-> the SqlServer way) or to use a ...
... and this is why the VisualStudio test runner uses a **dedicated worker process**, the QTAgent32, which can happily crash anytime it wants without affecting VisualStudio (well, almost, but that's another story).
Why (gory details)
------------------
So here's what happened:
It turned out that the `SyncProcessMessage` in `XmlNodeCallback` was being given something completely different from what the xUnit code was expecting. Of course, it was an `IMessage` instance. Looking at that implementation of `SyncProcessMessage`, it's clear that it assumes the `IMessage msg` argument will be a (possibly proxied) reference to an `ExecutorCallback.MessageSinkCallback.OutgoingMessage` just sent over the nonexistent wire. And actually, for the slightly earlier call to `EnumerateTests`, this is absolutely true!
However, `EnumerateTests` is anticipated by VisualStudio to be safe, because it is meant to only call the __test plugin__'s code, which is assumed to be well written. All the other calls, like `RunAssembly` or `RunTests`, are not, and calls to them are passed through a safer channel. Please note the complicated call stack chain, passing through two AppDomains, one process boundary, and a few Threads:
<!-- language: lang-none -->
FROM devenv.exe: Worker Thread "Controller: state execution thread for run 5448f460-1b83-4d4f-81b7-41c54b7d738b"
Microsoft.VisualStudio.QualityTools.Common.dll!Microsoft.VisualStudio.TestTools.Common.StateMachine<Microsoft.VisualStudio.TestTools.Common.RunState>.Execute()
THROUGH the remoting
FROM QTAgent32.exe: Worker Thread "Agent: state execution thread for test 'Test1' with id '83ee227c-1684-406a-9d29-d97bb8b20166'"
Microsoft.VisualStudio.QualityTools.AgentObject.dll!Microsoft.VisualStudio.TestTools.Agent.AgentExecution.ExecuteTest(object obj)
Microsoft.VisualStudio.QualityTools.AgentObject.dll!Microsoft.VisualStudio.TestTools.Agent.AgentExecution.ExecuteTest(bool forceSynchronousExecution = false)
Microsoft.VisualStudio.QualityTools.Common.dll!Microsoft.VisualStudio.TestTools.Common.StateMachine<Microsoft.VisualStudio.TestTools.Common.TestState>.Execute()
Microsoft.VisualStudio.QualityTools.AgentObject.dll!Microsoft.VisualStudio.TestTools.Agent.AgentExecution.TestStateStarted()
Microsoft.VisualStudio.QualityTools.ExecutionCommon.dll!Microsoft.VisualStudio.TestTools.Execution.ExecutionUtilities.StartNewThread(System.Threading.ParameterizedThreadStart parameterizedThreadStart, System.Threading.ApartmentState apartmentState, int maxStackSize, object parameter, string threadName)
THROUGH thread change
INSIDE QTAgent32.exe: Worker Thread "Agent: adapter run thread for test 'Test1' with id '83ee227c-1684-406a-9d29-d97bb8b20166'"
Microsoft.VisualStudio.QualityTools.AgentObject.dll!Microsoft.VisualStudio.TestTools.Agent.AgentExecution.CallAdapterRunMethod(object obj)
XUnitForVS.dll!XUnitForVS.UnitTestAdapter.Run(Microsoft.VisualStudio.TestTools.Common.ITestElement testElement = {XUnitForVS.UnitTest}, Microsoft.VisualStudio.TestTools.Execution.ITestContext testContext = {Microsoft.VisualStudio.TestTools.Agent.TestContext})
XUnitForVS.dll!XUnitForVS.UnitTestRunner.ExecuteTest(Xunit.IExecutorWrapper executor = {Xunit.ExecutorWrapper}, System.Guid runId = {System.Guid}, XUnitForVS.UnitTest test = {XUnitForVS.UnitTest})
xunit.runner.utility.dll!Xunit.TestRunner.RunTest(string type, string method)
xunit.DLL!Xunit.Sdk.Executor.RunTest..ctor(Executor executor, string _type, string _method, object _handler)
xunit.DLL!Xunit.Sdk.Executor.RunTests..ctor(Executor executor, string _type, List<string> _methods, object _handler)
THROUGH thread change
INSIDE QTAgent32.exe: Worker Thread "xUnit.net STA Test Execution Thread"
xunit.DLL!Xunit.Sdk.Executor.ThreadRunner(object threadStart = {Method = Implicit function evaluation is turned off by user.})
xunit.DLL!Xunit.Sdk.Executor.RunTests..ctor.AnonymousMethod__f()
xunit.DLL!Xunit.Sdk.TestClassCommandRunner.Execute(Xunit.Sdk.ITestClassCommand, System.Collections.Generic.List<Xunit.Sdk.IMethodInfo>, System.Predicate<Xunit.Sdk.ITestCommand>, System.Predicate<Xunit.Sdk.ITestResult>)
xunit.DLL!Xunit.Sdk.Executor.RunTests..ctor.AnonymousMethod__10(Xunit.Sdk.ITestCommand command = {Xunit.Sdk.TimedCommand})
xunit.DLL!Xunit.Sdk.Executor.OnTestStart(Xunit.Sdk.ITestCommand command = {Xunit.Sdk.TimedCommand}, Xunit.Sdk.ExecutorCallback callback = {Xunit.Sdk.ExecutorCallback.MessageSinkCallback})
xunit.DLL!Xunit.Sdk.ExecutorCallback.MessageSinkCallback.Notify(string value)
THROUGH the remoting
FROM QTAgent32.exe: Worker Thread "xUnit.net STA Test Execution Thread" [different AppDomain than the above]
ExecutorWrapper.XmlNodeCallbackHandlerWithIMessageSink....IMessageSink.SyncProcessMessage(IMessage msg)
And here's a dump of the `IMessage msg` argument that `SyncProcessMessage` was provided with.
<!-- language: lang-none -->
IMessageSink.SyncProcessMessage(IMessage msg)
msg = System.Runtime.Remoting.Messaging.MethodCall [implements IMethodCallMessage]
ArgCount 1 int
argMapper null System.Runtime.Remoting.Messaging.ArgMapper
args {object[1]} object[]
{1#} [0] {__TransparentProxy}
callContext {System.Runtime.Remoting.Messaging.LogicalCallContext}
ExternalProperties {System.Runtime.Remoting.Messaging.MCMDictionary}
fSoap false
fVarArgs false
HasVarArgs false
identity null System.Runtime.Remoting.Identity
InArgCount 1 int
InArgs {object[1]} object[]
{1#} [0] {__TransparentProxy}
instArgs null System.Type[]
InternalProperties {System.Collections.Hashtable, Count=0}
LogicalCallContext {System.Runtime.Remoting.Messaging.LogicalCallContext}
MethodBase {System.Reflection.RuntimeMethodInfo}
-> MethodName "SyncProcessMessage"
methodName "SyncProcessMessage"
methodSignature {System.Type[1]}
MethodSignature {System.Type[1]}
MI {System.Reflection.RuntimeMethodInfo}
Properties {System.Runtime.Remoting.Messaging.MCMDictionary}
msg.Properties["__Uri"] -> "/7398fd15_73d1_495a_9057_32080c85b197/zmwu73vp7yrx9qkmpxacddum_6.rem"
-> msg.Properties["__MethodName"] -> "SyncProcessMessage"
msg.Properties["__MethodSignature"] -> {System.Type[1]}
msg.Properties["__TypeName"] -> "System.Runtime.Remoting.Messaging.IMessageSink, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
msg.Properties["__Args"] -> {object[1]}
{1#} [0] {__TransparentProxy}
msg.Properties["__CallContext"] -> {System.Runtime.Remoting.Messaging.LogicalCallContext}
srvID {System.Runtime.Remoting.ServerIdentity}
typeName "System.Runtime.Remoting.Messaging.IMessageSink, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
TypeName "System.Runtime.Remoting.Messaging.IMessageSink, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
uri "/7398fd15_73d1_495a_9057_32080c85b197/zmwu73vp7yrx9qkmpxacddum_6.rem"
Uri "/7398fd15_73d1_495a_9057_32080c85b197/zmwu73vp7yrx9qkmpxacddum_6.rem"
Why (grande finale)
-------------------
So the `XmlNodeCallback` was given a `System.Runtime.Remoting.Messaging.MethodCall`, nothing like what the author of that code expected!
And, more importantly, this was an **infrastructure message**!
From [MSDN on IMessageSink](http://msdn.microsoft.com/en-us/library/system.runtime.remoting.messaging.imessagesink.aspx):
> A remote method call is a message that goes from the client end to the server end and possibly back
> again. As it crosses remoting boundaries on the way, the remote method call passes through a chain
> of IMessageSink objects. Each sink in the chain receives the message object, performs a specific operation,
> and delegates to the next sink in the chain. **The proxy object contains a reference to the first IMessageSink
> it needs to use to start off the chain.**
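For reference, the mscorlib interface in question is tiny; every sink in such a chain implements just this:

    // System.Runtime.Remoting.Messaging (mscorlib)
    public interface IMessageSink
    {
        IMessage SyncProcessMessage(IMessage msg);
        IMessageCtrl AsyncProcessMessage(IMessage msg, IMessageSink replySink);
        IMessageSink NextSink { get; }
    }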
Please look at the code at the bottom of the article and at the message types it anticipates in the sink method of a handmade proxy: `IMethodCallMessage` and `IMethodReturnMessage`. The former is exactly what the `XmlNodeCallback` was provided with: the lower-layer `IMethodCallMessage`. What's more, it was 100% properly filled with all the data: it specified that a *SyncProcessMessage* is to be called with *"args={1#}:__TransparentProxy"*, and guess what!
- this proxy pointed right at the `OutgoingMessage` from `ExecutorCallback`!
At this point it was obvious to me that the remoting layer somehow assumed that this proxy has an "IMessageSink chain", in the global 'transport' sense, and helpfully tried to filter/process the message through that chain.
Looking at the sample on MSDN, it is obvious that they have their sink built by the framework, and neither their remote object nor their handmade proxy implements `IMessageSink`. From that article, it __may seem safe__ to implement and use `IMessageSink` in your own solutions. Nothing indicates that such behaviour may (or is meant to!) occur.
However, it does occur, at least in QTAgent32.
So I also checked:
- [WROX: .NET Remoting Architecture](http://p2p.wrox.com/content/articles/net-remoting?page=0,5)
- [MSDN: Sinks and Sink Chains](http://msdn.microsoft.com/en-us/library/tdzwhfy3(v=vs.100).aspx)
- [Lluis Sanchez page](http://primates.ximian.com/~lluis/samples/InterceptorChannel.cs)
- [Napstronic: NETRemoting](http://www.diranieh.com/NETRemoting/InsideTheFramework.htm)
and none of them mentions any case of the RemoteObject implementing `IMessageSink`, but at least the last link provides a **very detailed** description of the sink chain architecture. Looking at the server-side sinks (it was an INCOMING call!), the remote object could have been mistakenly taken as a:
- CrossContextChannel -- nope, as there was `CrossContextChannel.SyncProcessMessageCallback` higher in the callstack
- ServerContextTerminatorSink -- nah, as `ServerObjectTerminatorSink.SyncProcessMessage` was the __direct caller__ of our failing SyncProcessMessage!
- LeaseSink -- possible, but the `InitializeLifetimeService` returns null
- ServerObjectTerminatorSink -- possible, but the article says it is taken 'from the message's serveridentity'
- StackBuilderSink -- possible, but again, the article says 'from the message's serveridentity'
Actually, someone may find it interesting that the nearest few callstack entries were:
<!-- language: lang-none -->
00:> xunit.runner.utility.dll!Xunit.ExecutorWrapper.XmlNodeCallbackHandlerWithIMessageSink.System.Runtime.Remoting.Messaging.IMessageSink.SyncProcessMessage(System.Runtime.Remoting.Messaging.IMessage msg = {System.Runtime.Remoting.Messaging.MethodCall}) Line 415 C#
-1: mscorlib.dll!System.Runtime.Remoting.Messaging.ServerObjectTerminatorSink.SyncProcessMessage(System.Runtime.Remoting.Messaging.IMessage reqMsg) + 0xad bytes
-2: mscorlib.dll!System.Runtime.Remoting.Messaging.ServerContextTerminatorSink.SyncProcessMessage(System.Runtime.Remoting.Messaging.IMessage reqMsg) + 0x8a bytes
-3: mscorlib.dll!System.Runtime.Remoting.Channels.CrossContextChannel.SyncProcessMessageCallback(object[] args) + 0x94 bytes
-4: mscorlib.dll!System.Threading.Thread.CompleteCrossContextCallback(System.Threading.InternalCrossContextDelegate ftnToCall, object[] args) + 0x8 bytes
... [Native to Managed Transition]
Why (epilogue)
--------------
I wouldn't be myself if I hadn't checked the implementation of the direct caller for some hints:

    // mscorlib.dll: System.Runtime.Remoting.Messaging.ServerObjectTerminatorSink.SyncProcessMessage
    [SecurityCritical]
    public virtual IMessage SyncProcessMessage(IMessage reqMsg)
    {
        IMessage message;
        IMessage message1 = InternalSink.ValidateMessage(reqMsg);
        if (message1 == null)
        {
            ServerIdentity serverIdentity = InternalSink.GetServerIdentity(reqMsg);
            ArrayWithSize serverSideDynamicSinks = serverIdentity.ServerSideDynamicSinks;
            if (serverSideDynamicSinks != null)
            {
                DynamicPropertyHolder.NotifyDynamicSinks(reqMsg, serverSideDynamicSinks, false, true, false);
            }
            IMessageSink serverObject = this._stackBuilderSink.ServerObject as IMessageSink; // <--- HERE, source of 'serverObject'
            if (serverObject == null)
            {
                message = this._stackBuilderSink.SyncProcessMessage(reqMsg);
            }
            else
            {
                message = serverObject.SyncProcessMessage(reqMsg); // <---- HERE, the calling line
            }
            if (serverSideDynamicSinks != null)
            {
                DynamicPropertyHolder.NotifyDynamicSinks(message, serverSideDynamicSinks, false, false, false);
            }
            return message;
        }
        else
        {
            return message1;
        }
    }
The 'ServerObject' is of course the 'RemoteObject' that was exposed via remoting -- our `XmlNodeCallback`. So it is not a bug, but a cool yet quite obscure feature of .NET Remoting, available only on the server side: `XmlNodeCallbackHandlerWithIMessageSink` has been taken as a `StackBuilderSink` overrider.
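To make that feature tangible, here's a minimal sketch (the class name is mine, and it assumes classic .NET Framework remoting across AppDomains): any MarshalByRefObject that also implements `IMessageSink` gets its incoming remote calls routed through its own `SyncProcessMessage`:

    using System;
    using System.Runtime.Remoting;
    using System.Runtime.Remoting.Messaging;

    // Hypothetical demo class. Because it is both a MarshalByRefObject and
    // an IMessageSink, ServerObjectTerminatorSink hands every incoming remote
    // call to SyncProcessMessage instead of dispatching via StackBuilderSink.
    public class SelfSinkObject : MarshalByRefObject, IMessageSink
    {
        public string Hello() { return "hello"; }

        IMessage IMessageSink.SyncProcessMessage(IMessage msg)
        {
            Console.WriteLine("intercepted: " + ((IMethodCallMessage)msg).MethodName);
            // Dispatch the call onto ourselves, as the StackBuilderSink would have.
            return RemotingServices.ExecuteMessage(this, (IMethodCallMessage)msg);
        }

        IMessageCtrl IMessageSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)
        {
            throw new NotSupportedException(); // see the PS below for why this is a bad idea
        }

        IMessageSink IMessageSink.NextSink { get { return null; } }
    }

Marshal an instance into another AppDomain, call `Hello()` on the proxy, and the "intercepted" line should print on the server side before the method body runs.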
Revisiting [MSDN: Sinks and Sink Chains](http://msdn.microsoft.com/en-us/library/tdzwhfy3(v=vs.100).aspx):
> Custom Channel Sinks
> On the client, custom channel sinks are inserted into the chain of objects between the formatter sink and the last transport sink. Inserting a custom channel sink in the client or server channel **enables you to process the IMessage** at one of the following points:
>
> - During the process by which a call represented as a message is converted into a stream and sent over the wire.
> - During the process by which a stream is taken off the wire and sent to **the last message sink before the remote object** on the server or the proxy object (on the client).
>
> Custom sinks can read or write data (depending if the call is outgoing or incoming) to the stream and add additional information to the headers where desired. At this stage, the message has already been serialized by the formatter and cannot be modified. When the message call is forwarded to the transport sink at the end of the chain, the transport sink writes the headers to the stream and forwards the stream to the transport sink on the server using the transport protocol dictated by the channel.
Seriously, from this description, I would never have guessed it is done this way.
So, how about a fix?
--------------------
Having said all the above, the solution is trivial: the only thing needed is to refactor everything to stop using mscorlib's `IMessageSink` and to use an identical custom interface with a different name, similar to what was there before the upgrade.
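Sketched out, that refactor boils down to something like this (the interface name below is hypothetical; any name other than mscorlib's works):

    using System.Runtime.Remoting.Messaging;

    // Hypothetical name. Same shape as the relevant part of IMessageSink,
    // but since it is NOT mscorlib's IMessageSink, ServerObjectTerminatorSink
    // will no longer treat its implementers as part of the sink chain.
    public interface ICallbackMessageSink
    {
        IMessage SyncProcessMessage(IMessage msg);
    }

An implementer of such an interface is no longer an `IMessageSink` in the eyes of `ServerObjectTerminatorSink`, so it gets the normal `StackBuilderSink` dispatch again.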
But, "only refactor" sometimes takes years. In this case, I assume that it can take some time, I myself didn't have enough free time neither.
So, there's also a quicker hot fix: if a `MethodCall` arrives -- just dispatch it properly!
Remember the [Lluis Sanchez page](http://primates.ximian.com/~lluis/samples/InterceptorChannel.cs) I mentioned earlier? Check its `TypeTerminatorSink`:

    public IMessage ExecuteMessage (IMethodCallMessage call)
    {
        MarshalByRefObject target = InterceptionManager.GetObject (call.Uri);
        if (target == null) throw new InvalidOperationException ("Object for uri '" + call.Uri + "' not found");
        return RemotingServices.ExecuteMessage (target, call); // <---- HERE
    }
So, to fix the problem, only a single sanity check has to be added in xUnit's `XmlNodeCallbackHandlerWithIMessageSink`:

    IMessage IMessageSink.SyncProcessMessage(IMessage msg)
    {
        // An infrastructure MethodCall sneaked in: dispatch it onto ourselves,
        // exactly the way a StackBuilderSink would have done.
        if (msg is IMethodCallMessage)
            return System.Runtime.Remoting.RemotingServices.ExecuteMessage(
                this, (IMethodCallMessage)msg);

        // ... the original message handling continues below, unchanged ...
and it also needs to be copied a little above, into the sister `IntCallbackHandlerWithIMessageSink`.
I'll probably post those fixes to xUnit's issue tracker in a few days.
PS. If anyone <u>from xUnit's team</u> is reading this by chance:
If you are going to stick with the `IMessageSink` and this hot fix, I suggest also borrowing the `AsyncProcessMessage` implementation from [MSDN: Sinks and Sink Chains](http://msdn.microsoft.com/en-us/library/tdzwhfy3(v=vs.100).aspx), as both methods of an `IMessageSink` are said to be **required**; it is explicitly written right in the documentation:
> Notes to Implementers
> It is important to note that code implementing the current interface must provide implementations for both SyncProcessMessage and AsyncProcessMessage, since synchronous calls can be converted to asynchronous calls and vice versa. Both methods must be implemented, even if the sink does not support asynchronous processing.
And more importantly, an `IMessageSink` is **disallowed from throwing any exceptions** (and right now there are some NotImplementedException throws sitting in place of the async part). If you need proof of the no-exceptions clause, I can find a link for that too.
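For illustration, a minimal non-throwing variant could look roughly like this (a sketch of mine, loosely following the MSDN pattern of degrading the async call to a synchronous one; not the MSDN code verbatim):

    IMessageCtrl IMessageSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)
    {
        // Must not throw: process synchronously and forward the reply, if anyone wants it.
        IMessage reply = ((IMessageSink)this).SyncProcessMessage(msg);
        if (replySink != null)
            replySink.SyncProcessMessage(reply);
        return null; // no controller for the (already finished) "async" call
    }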
At some point I'll probably post a patch with the async part too, or with the better fix (removing the `IMessageSink` in favor of a custom interface), but don't wait for me if the grue eats me.
HTH, q
]]></script>